I. REAL PARTY IN INTEREST

The real party in interest is Genentech, Inc., South San Francisco, California, by an 
assignment of the parent application, U.S. Patent Application Serial No. 09/941,992 recorded 
November 16, 2001, at Reel 012176 and Frame 0450. 

II. RELATED APPEALS AND INTERFERENCES

The claims pending in the current application are directed to polypeptides referred to 
herein as "PRO1111". There exist two related patent applications, (1) U.S. Patent Application
Serial No. 09/989,279, now Patent No. 7,083,978, issued 08-01-2006 (containing claims directed
to nucleic acids encoding PRO1111 polypeptides), and (2) U.S. Patent Application Serial No.
09/990,562, filed November 14, 2001 (containing claims directed to PRO1111 antibodies), now
abandoned. 

III. STATUS OF CLAIMS 

Claims 124-126 and 129-133 are in this application. 
Claims 1-123, 127 and 128 have been canceled. 

Claims 124-126 and 129-133 stand rejected and Appellants appeal the rejection of these
claims.

IV. STATUS OF AMENDMENTS 

A summary of the prosecution history for this case is as follows: 

Previously, in response to a Final Office Action mailed on December 26, 2006, a Notice 
of Appeal was filed June 20, 2007. A Preliminary Amendment with Request for Continued 
Examination was subsequently filed on January 18, 2008. This was followed by another Final 
Office action mailed April 7, 2008. A Notice of Appeal was filed on September 5, 2008 in this 
case. 

Claims 132 and 133 have been amended in a supplemental amendment/response to the 
final Office Action of April 7, 2008 filed concurrently with the present appeal. A copy of the 
rejected claims in the present Appeal is provided in the Claims Appendix, incorporating the 
amendment. 
V. SUMMARY OF CLAIMED SUBJECT MATTER

The invention claimed in the present application is related to an isolated polypeptide
comprising the amino acid sequence of the polypeptide of SEQ ID NO: 229, referred to in the
present application as "PRO1111". The PRO1111 gene was shown for the first time in the
present application to be significantly amplified in human lung and colon cancers as compared to
normal, non-cancerous human tissue controls (Example 170). This feature is specifically recited
in independent Claim 124, and carried by all claims dependent from Claim 124. In addition, the
invention also claims the amino acid sequence of the polypeptide of SEQ ID NO: 229, lacking its
associated signal-peptide (Claims 124(b) and 126); or the amino acid sequence of the
polypeptide encoded by the full-length coding sequence of the cDNA deposited under ATCC
accession number 203110 (Claims 124(c) and 129). The amino acid sequence of the native
"PRO1111" polypeptide and the nucleic acid sequence encoding this polypeptide (referred to in
the present application as "DNA58721-1475") are shown in the present specification as SEQ ID
NOs: 229 and 228, respectively, and in Figures 157 and 156, described on pages 146-148. The
full-length PRO1111 polypeptide having the amino acid sequence of SEQ ID NO: 229 is
described in the specification at, for example, pages 19-20, pages 146-148 and page 353; and
the isolation of cDNA clones encoding PRO1111 of SEQ ID NO: 229 is described in Example
67, page 4574 of the specification.

The invention is further directed to a chimeric polypeptide comprising one of the above 
polypeptides fused to a heterologous polypeptide (Claim 130), and to a chimeric polypeptide 
wherein the heterologous polypeptide is an epitope tag or an Fc region of an immunoglobulin 
(Claim 131). The preparation of chimeric PRO polypeptides (Claims 130 and 131), including 
those wherein the heterologous polypeptide is an epitope tag or an Fc region of an 
immunoglobulin, is set forth in the specification at page 374, lines 24 to page 375, line 9. 
Examples 140-143 and page 376, line 12 onwards describe the expression of PRO polypeptides 
in various host cells, including E. coli, mammalian cells, yeast and Baculovirus-infected insect 
cells. 

Finally, Example 170, in the specification at page 539, line 19, to page 555, line 5, sets
forth a 'Gene Amplification assay' which shows that the PRO1111 gene is amplified in the
genome of certain human lung and colon cancers (see Table 9A). The profiles of various
primary lung and colon tumors used for screening the PRO polypeptide compounds of the
invention in the gene amplification assay are summarized on Table 8, page 546 of the
specification.

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 

1. Whether Claims 124-126 and 129-133 are entitled to the priority date of U.S.
Provisional Patent Application Serial No. 60/141,037, filed June 23, 1999.

2. Whether instant Claims 124-126 and 129-133 satisfy the utility/enablement
requirement under 35 U.S.C. §§101/112, first paragraph.

3. Whether Claims 132-133 satisfy the enablement requirement of 35 USC §112, 
first paragraph. 

4. Whether Claims 132-133 satisfy the written description requirement of 35 USC 
§112, first paragraph. 

5. Whether Claims 132-133 particularly point out and distinctly claim the subject 
matter of the invention as required under 35 USC §112, second paragraph. 

6. Whether Claims 124-126, 129 and 132-133 are anticipated under 35 U.S.C.
§ 102(a) by Wang et al., Genbank Accession No. AF196976 (October 1999).

7. Whether Claims 119-123 and 132-133 are anticipated under 35 U.S.C. § 102(a) by
Jacobs et al., Genbank Accession No. AAY28806 (October 1999).

8. Whether Claims 130-133 are anticipated under 35 U.S.C. § 102(a) by Jacobs et al.,
WO 99/50405 (October 1999).

9. Whether Claims 124, 127 and 130-133 are anticipated under 35 U.S.C. § 102(e) by
Shimkets et al., US Patent No. 6,689,866 (March 2000).

10. Whether Claims 130 and 132-133 are made obvious under 35 U.S.C. § 103(a) by 
any one of Loci AI769814, AI435407, AI470931, or T15752 in view of Sibson et al, WO 
94/01548 (January 1994). 

11. Whether Claim 131 is made obvious under 35 U.S.C. § 103(a) by any one of Loci 
AI769814, AI435407, AI470931, or T15752 in view of Sibson et al and further in view of 
Capon et al., US Patent No. 5,116,964 (May 1992).

12. Whether Claims 130 and 131 are made obvious under 35 U.S.C. § 103(a) by Wang
et al., Genbank Accession No. AF196976 (October 1999), in view of Capon et al., US Patent
No. 5,116,964 (May 1992).

VII. ARGUMENTS 
Summary of the Arguments: 

Issue 1: Priority 

Contrary to the Examiner's arguments structured around the assumption that utility is 
based on the chondrocyte redifferentiation assay, Appellants submit that they rely on the 'gene
amplification' assay (Example 170), not the chondrocyte redifferentiation assay, for patentable utility
of the instantly claimed subject matter. As a consequence of this incorrect assumption, the 
instant application has not been granted the earlier priority date of U.S. Provisional Patent 
Application Serial No. 60/141,037, filed June 23, 1999. The gene amplification utility was first 
disclosed in Example 23 in the U.S. Provisional Patent Application Serial No. 60/141,037, filed 
June 23, 1999, priority for which has been claimed in this application and relevant pages of 
which have been submitted to the Examiner with the previous response. Appellants believe that 
they are at least entitled to an effective filing date of June 23, 1999 based on the results of the 
'gene amplification' assay for the present case. 

In response to Appellants' numerous attempts to clarify the basis of utility, the Examiner 
has objected to the June 23, 1999 priority date on the grounds that the 60/141,037 application 
fails to provide a utility and lacks an enabling disclosure for the claimed invention under 35 
U.S.C. §§101/112, first paragraph. Appellants submit that, for the same reasons discussed below
under Issue 2, U.S. Provisional Patent Application Serial No. 60/141,037 also satisfies the utility 
requirements. Therefore, Appellants should be entitled to the priority date of June 23, 1999. 

Issue 2: Utility/Enablement

Appellants rely upon the gene amplification data of the PRO1111 gene for patentable
utility of the PRO1111 polypeptides. These data are clearly disclosed in the instant specification in
Example 170, which discloses that the gene encoding PRO1111 showed significant amplification,
ranging from 2.0705 to 2.99 fold in seven different lung primary tumors and 2.0705 to 2.603
fold in four different colon primary tumors. Example 170 in the instant specification discloses
that "[a]mplification is associated with overexpression of the gene product, indicating that the
polypeptides are useful targets for therapeutic intervention in certain cancers such as colon, lung,
breast and other cancers and diagnostic determination of the presence of those cancers"
(emphasis added). Appellants have also submitted, with their Response filed November 8, 2004,
the Declaration of Dr. Audrey Goddard, which explains that a gene identified as being amplified
at least 2-fold by the disclosed gene amplification assay in a tumor sample relative to a normal
sample is useful as a marker for the diagnosis of cancer, for monitoring cancer development
and/or for measuring the efficacy of cancer therapy. Therefore, Appellants submit that one of
skill in the art would reasonably expect in this instance, based on the amplification data for the
PRO1111 gene, that the PRO1111 polypeptide is concomitantly overexpressed and has utility in
the diagnosis of lung and colon cancers or for individuals at risk for developing lung and colon
cancers.

The Examiner has asserted that it does not necessarily follow that an increase in gene 
copy number results in increased gene expression and increased protein expression, such that a 
polypeptide or the antibody that binds it would be useful diagnostically. In support of these 
assertions, the Examiner has referred to articles by Sen et al., Pennica et al., Konopka et al.,
Haynes et al., Hu et al., Godbout et al. and Li et al.

Appellants submit that the Examiner has applied an improper legal standard when 
making this rejection. The evidentiary standard to be used throughout ex parte examination in 
setting forth a rejection is a preponderance of the totality of the evidence under consideration. 
Thus, to overcome the presumption of truth that an assertion of utility by the Applicant enjoys, 
the Examiner must establish that it is more likely than not that one of ordinary skill in the art 
would doubt the truth of the statement of utility. Only after the Examiner has made a proper 
prima facie showing of lack of utility, does the burden of rebuttal shift to the Applicant. 

The references cited by the Examiner do not suffice to make a prima facie case that it is 
more likely than not that a generalized correlation does not exist between gene (DNA) 
amplification and increased polypeptide levels. In particular, the teachings of Sen et al., Pennica
et al., Konopka et al., Haynes et al., Hu et al., Godbout et al. and Li et al. do not establish that
amplification of a gene, whether by aneuploidy or any other mechanism, is not useful as a
diagnostic marker. Nor do these references suffice to show that a lack of correlation between
gene amplification data and the biological significance of cancer genes is typical. Thus, these
teachings cannot support a general conclusion regarding correlation between gene amplification
and mRNA or protein levels.

In contrast, Appellants have submitted ample evidence to show that, in general, if a gene 
is amplified in cancer, it is more likely than not that the encoded protein will be expressed at an 
elevated level. First, the articles by Orntoft et al., Hyman et al., and Pollack et al., submitted in
their Response filed November 8, 2004, collectively teach that in general gene amplification
increases mRNA expression. Second, Appellants have submitted over a hundred references,
along with Declarations of Dr. Paul Polakis, which collectively teach that, in general, there is a
correlation between mRNA levels and polypeptide levels. Appellants would also like to bring to
the Board's attention a recent decision in a microarray case by the Board of Patent Appeals and
Interferences (Decision on Appeal No. 2006-1469). In its decision, the Board reversed the utility
rejection, acknowledging that "there is a strong correlation between mRNA levels and protein
expression, and the Examiner has not presented any evidence specific to the PRO1866
polypeptide to refute that." (Page 9). Appellants submit that, in the instant application, the
Examiner has likewise not presented any evidence specific to the PRO1111 polypeptide to refute
Appellants' assertion of a correlation between DNA levels, mRNA levels and protein expression.

Appellants further submit that, as evidenced by the Ashkenazi Declaration and the 
teachings of Hanna and Mornin (both made of record in Appellants' Response filed November 8, 
2004), simultaneous testing of gene amplification and gene product over-expression enables 
more accurate tumor classification, even if the gene product, the protein, is not over-expressed.
This leads to better determination of a suitable therapy for the tumor, as demonstrated by a
real-world example of the breast cancer marker HER-2/neu. Therefore, as a general rule, one skilled
in the art would find it more likely than not that PRO1111 polypeptides are useful as diagnostic
tools for detecting lung and colon tumors.

Taken together, although there are some examples in the scientific art that do not fit 
within the central dogma of molecular biology that there is generally a positive correlation 
between DNA, mRNA, and polypeptide levels, the art overwhelmingly shows that, for the
majority of amplified genes, gene amplification influences gene expression at the mRNA and
protein levels. Therefore, one of skill in the art would reasonably expect in this instance, based
on the amplification data for the PRO1111 gene, that the PRO1111 polypeptide is concomitantly
overexpressed and has utility in the diagnosis of lung and colon cancers.

As the Examiner no longer questions whether mRNA levels are predictive of polypeptide
levels, the evidence presented by Appellants supports that gene amplification correlates with
increased protein expression. Based on the 2.07-fold to 2.99-fold amplification in colon and
lung primary tumors, one of ordinary skill would find it credible that the claimed PRO1111
polypeptides would have utility as markers for the diagnosis of lung and colon tumors, and for
monitoring cancer development and/or for measuring the efficacy of cancer therapy.

Accordingly, Appellants submit that when the proper legal standard is applied, one 
should reach the conclusion that the present application discloses at least one patentable utility 
for the claimed PRO1111 polypeptides. Further, one of ordinary skill in the art would also
understand how to make and use the recited polypeptides for the diagnosis of lung and colon 
cancers without any undue experimentation. 

Issue 3: Enablement 

Claims 132-133 further stand rejected under 35 U.S.C. §112, first paragraph as allegedly 
lacking enablement for the claimed polypeptide variants. 

Appellants submit that, as discussed above, the PRO1111 polypeptides have utility in the
diagnosis of cancer. Based on such a utility, one of skill in the art would know exactly how to
use the claimed polypeptides for diagnosis of cancer, without any undue experimentation.

Appellants note that the claimed variants, in addition to having at least 95% amino acid
sequence identity to SEQ ID NO:229, also must satisfy the functional limitation that "the nucleic
acid encoding said polypeptide is amplified in lung or colon tumor." Thus, the claimed variants
all share the disclosed utility of the PRO1111 polypeptide in being useful for the diagnosis of
cancer. The specification provides ample guidance to allow the skilled artisan to identify those
polypeptide variants which meet the limitations of the claims, including a detailed protocol for
the gene amplification assay, and detailed guidance as to how to identify and make polypeptides
having at least 95% amino acid sequence identity to PRO1111 (SEQ ID NO:229). Accordingly,
one of ordinary skill in the art would understand how to make and use the recited polypeptide
variants without undue experimentation.

Issue 4: Written Description 

Claims 132-133 stand rejected under 35 U.S.C. §112, first paragraph as allegedly lacking
adequate written description for the claimed variant polypeptides. In particular, the Examiner 
has asserted that "[t]here is written description of a single species only, SEQ ID NO: 229. There 
is no written description or conception of any other proteins encoded by nucleic acids..." (Page 5 
of the Office Action mailed April 7, 2008). 

Appellants respectfully submit that the instant claims are similar to the exemplary claim 
in Example 10 of the revised Training Manual on Written Description Guidelines issued by the 
U.S. Patent Office. Appellants respectfully submit that the instant specification evidences the 
actual reduction to practice of the amino acid sequence of SEQ ID NO: 229. Thus, the genus of 
polypeptides with at least 95% sequence identity to SEQ ID NO:229 would meet the
requirement of 35 U.S.C. §112, first paragraph, as providing adequate written description. 

Issue 5: Indefiniteness 

Claims 132-133 stand rejected under 35 USC §112, second paragraph, for allegedly 
failing to particularly point out and distinctly claim the subject matter of the invention. 

Appellants submit that the instant specification provides a detailed description for
identifying the genus of nucleic acids that code for the polypeptide of SEQ ID NO:229 with 95%
similarity and that further possess the functional property recited in the claims, namely,
"wherein the nucleic acid encoding said polypeptide is amplified in adenocarcinomas or
squamous cell carcinomas of the lung or in adenocarcinomas of the colon." It also provides
step-by-step guidelines and protocols for testing these DNAs in the gene amplification assay, a
PCR-based assay, at least in Example 170. Accordingly, the metes and bounds of proteins
encoded by such nucleic acids are clearly defined such that one skilled in the art would know
how to make the invention.

Issues 6-9: Anticipation 

The instant claims stand rejected under 35 U.S.C. § 102(a) as being anticipated separately
by Wang et al. (Genbank Accession No. AF196976; October 1999), Jacobs et al. (Genbank
Accession No. AAY28806; October 1999), and Jacobs et al. (WO 99/50405; October 1999).
The instant claims further stand rejected under 35 U.S.C. § 102(e) as being anticipated by
Shimkets et al. (US Patent No. 6,689,866; March 2000).

Appellants submit that they rely on the 'gene amplification' assay (Example 170) for 
patentable utility of the instantly claimed subject matter. This utility was first disclosed in 
Example 23 in the U.S. Provisional Patent Application Serial No. 60/141,037, filed 
June 23, 1999, priority for which has been claimed in this application. Appellants are at least 
entitled to an effective filing date of June 23, 1999 based on the results of the 'gene 
amplification' assay for the currently pending claims, and this date precedes the publication date 
for Wang et al. Therefore, Wang et al. is not prior art under 35 U.S.C. § 102(a), and hence this
rejection should be withdrawn. 

The instant application has not been granted the earlier priority date on the grounds that 
allegedly the subject matter of the claims is not disclosed in the manner provided by 35 U.S.C. 
112, first paragraph for the reasons set forth above in the rejections under 35 U.S.C. 101/112, 
first paragraph. Appellants respectfully submit that as discussed above under Issues 1 and 2, the 
presently claimed invention is supported by a specific, substantial and credible utility, said utility 
being disclosed and supported in parent application 60/141,037, and, therefore, the present 
specification and its parent teach one of ordinary skill in the art "how to use" the claimed 
invention without undue experimentation. Accordingly, the instant application is entitled to the 
effective filing date of June 23, 1999, and thus Wang and both Jacobs references are not prior art. 

Issues 10-12: Obviousness 

The instant claims stand rejected under 35 U.S.C. § 103(a) as allegedly being 
unpatentable over any one of Loci AI769814, AI435407, AI470931, or T15752 in view of 
Sibson et al. (WO 94/01548; January 1994) and further in view of Capon et al. (US Patent No.
5,116,964; May 1992). Claims 130 and 131 are also alleged to be obvious under 35 U.S.C.
§103(a) by Wang et al. (Genbank Accession No. AF196976; October 1999), in view of Capon et
al. (US Patent No. 5,116,964; May 1992).

Appellants believe that AI769814, AI435407, AI470931 or T15752 are not prior art. 
Appellants maintain that the instant case is directed to polypeptides, particularly, to the 
polypeptide of SEQ ID NO:229, and not to nucleic acids. Appellants note that the polypeptide
sequences encoded by AI769814, AI435407, AI470931 or T15752 were not reduced to practice
in the cited art nor did the art provide any disclosure whatsoever of the full-length polypeptide
encoded by any of these nucleic acid fragments. Hence, this rejection for the instant polypeptide
case based on nucleic acid ESTs alone is not appropriate and therefore, AI769814, AI435407,
AI470931 or T15752 are not prior art. Appellants further point out that locus AI769814 has a
publication date of December 21, 1999. For the reasons discussed above, Appellants are at least 
entitled to an effective filing date of June 23, 1999 based on the results of the 'gene amplification' 
assay for the currently pending claims. Locus AI769814 is dated after the effective filing date 
of June 23, 1999. Therefore, Locus AI769814 is not prior art and these rejections should be 
withdrawn. 

As discussed above, loci AI769814, AI435407, AI470931 and T15752 do not disclose 
each and every limitation of Claims 130 and 132-133. Further, neither Sibson et al. nor Capon et
al. cures the deficiencies of loci AI769814, AI435407, AI470931 and T15752. Since the primary
references fall as prior art references, Appellants respectfully submit that the instant claims are
not obvious over loci AI769814, AI435407, AI470931 and T15752 in view of Sibson et al. or
further in view of Capon et al.

With respect to Wang et al. in view of Capon et al., for the reasons discussed above,
Appellants maintain that they are at least entitled to an effective filing date of June 23, 1999
based on the results of the 'gene amplification' assay for the currently pending claims, and this
date precedes the publication date for Wang et al. Therefore, Wang et al. is not prior art. Since
the primary reference falls as a prior art reference, Appellants respectfully submit that the instant
claims are not obvious over Wang et al. in view of Capon et al.

These arguments are all discussed in further detail below under the appropriate headings. 
Detailed Response to Rejections 
ISSUE 1: Claims 124-126 and 129-133 should be entitled to the priority date of U.S.
Provisional Patent Application Serial No. 60/141,037, filed June 23, 1999 

The instant application has not been granted the earlier priority date of U.S. Provisional 
Patent Application Serial No. 60/141,037, filed June 23, 1999 on the grounds that the prior 
60/141,037 application fails to provide a utility and lacks an enabling disclosure for the claimed 
invention under 35 U.S.C. §§101/112, first paragraph.

Appellants disagree and submit that, for the reasons discussed below under Issue 2, U.S. 
Provisional Patent Application Serial No. 60/141,037 also satisfies the utility requirements. 
Therefore, Appellants should be entitled to the priority date of June 23, 1999. 

ISSUE 2: Claims 124-126 and 129-133 are Supported by a Credible, Specific and 
Substantial Asserted Utility, and Thus Meet the Utility Requirement of 35 U.S.C. §101 and 
the "How to Use Prong" of the Enablement Requirement of 35 U.S.C. §112, First 
Paragraph 

The basis for the Examiner's rejection of Claims 124-126 and 129-133 under these 
sections is that the data presented in Example 170 of the present specification is allegedly
insufficient under applicable legal standards to establish a patentable utility under 35 U.S.C. 
§101 for the presently claimed subject matter, and further, since a patentable utility has not been 
established, one would not know how to use the claimed invention. 

Appellants strongly disagree and respectfully traverse the rejection. 

A. The Legal Standard For Utility Under 35 U.S.C. §101

According to 35 U.S.C. §101: 

Whoever invents or discovers any new and useful process, machine, manufacture, 
or composition of matter, or any new and useful improvement thereof, may obtain 
a patent therefor, subject to the conditions and requirements of this title. 
(Emphasis added). 

In interpreting the utility requirement, in Brenner v. Manson,1 the Supreme Court held
that the quid pro quo contemplated by the U.S. Constitution between the public interest and the
interest of the inventors required that a patent Applicant disclose a "substantial utility" for his or
her invention, i.e., a utility "where specific benefit exists in currently available form."2 The
Court concluded that "a patent is not a hunting license. It is not a reward for the search, but
compensation for its successful conclusion. A patent system must be related to the world of
commerce rather than the realm of philosophy."3

Later, in Nelson v. Bowler,4 the C.C.P.A. acknowledged that tests evidencing
pharmacological activity of a compound may establish practical utility, even though they may
not establish a specific therapeutic use. The Court held that "since it is crucial to provide
researchers with an incentive to disclose pharmaceutical activities in as many compounds as
possible, we conclude adequate proof of any such activity constitutes a showing of practical
utility."5

In Cross v. Iizuka,6 the C.A.F.C. reaffirmed Nelson, and added that in vitro results might
be sufficient to support practical utility, explaining that "in vitro testing, in general, is relatively
less complex, less time consuming, and less expensive than in vivo testing. Moreover, in vitro
results with the particular pharmacological activity are generally predictive of in vivo test results,
i.e., there is a reasonable correlation therebetween."7 The Court perceived "No insurmountable
difficulty" in finding that, under appropriate circumstances, "in vitro testing, may establish a
practical utility."8

1 Brenner v. Manson, 383 U.S. 519, 148 U.S.P.Q. (BNA) 689 (1966).

2 Id. at 534, 148 U.S.P.Q. (BNA) at 695.

3 Id. at 536, 148 U.S.P.Q. (BNA) at 696.

4 Nelson v. Bowler, 626 F.2d 853, 206 U.S.P.Q. (BNA) 881 (C.C.P.A. 1980).

5 Id. at 856, 206 U.S.P.Q. (BNA) at 883.

6 Cross v. Iizuka, 753 F.2d 1047, 224 U.S.P.Q. (BNA) 739 (Fed. Cir. 1985).

7 Id. at 1050, 224 U.S.P.Q. (BNA) at 747.

8 Id.
The case law has also clearly established that Appellants' statements of utility are usually
sufficient, unless such statement of utility is unbelievable on its face.9 The PTO has the initial
burden to prove that Appellants' claims of usefulness are not believable on their face.10 In
general, an Appellant's assertion of utility creates a presumption of utility that will be sufficient
to satisfy the utility requirement of 35 U.S.C. §101, "unless there is a reason for one skilled in
the art to question the objective truth of the statement of utility or its scope."11, 12

Compliance with 35 U.S.C. §101 is a question of fact.13 The evidentiary standard to be
used throughout ex parte examination in setting forth a rejection is a preponderance of the
totality of the evidence under consideration.14 Thus, to overcome the presumption of truth that
an assertion of utility by the Appellant enjoys, the Examiner must establish that it is more likely
than not that one of ordinary skill in the art would doubt the truth of the statement of utility.
Only after the Examiner has made a proper prima facie showing of lack of utility does the burden of
rebuttal shift to the Appellant. The issue will then be decided on the totality of evidence.

The well established case law is clearly reflected in the Utility Examination Guidelines
("Utility Guidelines"),15 which acknowledge that an invention complies with the utility
requirement of 35 U.S.C. §101, if it has at least one asserted "specific, substantial, and credible
utility" or a "well-established utility." Under the Utility Guidelines, a utility is "specific" when
it is particular to the subject matter claimed. For example, it is generally not enough to state that
a nucleic acid is useful as a diagnostic without also identifying the conditions that are to be
diagnosed.

9 In re Gazave, 379 F.2d 973, 154 U.S.P.Q. (BNA) 92 (C.C.P.A. 1967).

10 Ibid.

11 In re Langer, 503 F.2d 1380, 1391, 183 U.S.P.Q. (BNA) 288, 297 (C.C.P.A. 1974).

12 See also In re Jolles, 628 F.2d 1322, 206 USPQ 885 (C.C.P.A. 1980); In re Irons, 340 F.2d 974, 144
USPQ 351 (1965); In re Sichert, 566 F.2d 1154, 1159, 196 USPQ 209, 212-13 (C.C.P.A. 1977).

13 Raytheon v. Roper, 724 F.2d 951, 956, 220 U.S.P.Q. (BNA) 592, 596 (Fed. Cir. 1983) cert. denied, 469
US 835 (1984).

14 In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d (BNA) 1443, 1444 (Fed. Cir. 1992).

15 66 Fed. Reg. 1092 (2001).

In explaining the "substantial utility" standard, M.P.E.P. §2107.01 cautions, however, 
that Office personnel must be careful not to interpret the phrase "immediate benefit to the 
public" or similar formulations used in certain court decisions to mean that products or services 
based on the claimed invention must be "currently available" to the public in order to satisfy the 
utility requirement. "Rather, any reasonable use that an applicant has identified for the invention 
that can be viewed as providing a public benefit should be accepted as sufficient, at least with
regard to defining a 'substantial' 'utility.'"16 Indeed, the Guidelines for Examination of
Applications for Compliance With the Utility Requirement17 give the following instruction to
patent examiners: "If the Applicant has asserted that the claimed invention is useful for any 
particular practical purpose . . . and the assertion would be considered credible by a person of 
ordinary skill in the art, do not impose a rejection based on lack of utility." 

B. Proper Application of the Legal Standard 

Appellants submit that the evidentiary standard to be used throughout ex parte 
examination of a patent application is a preponderance of the totality of the evidence under 
consideration. Thus, to overcome the presumption of truth that an assertion of utility by the 
Appellant enjoys, the Examiner must establish that it is more likely than not that one of ordinary 
skill in the art would doubt the truth of the statement of utility. Only after the Examiner has 
made a proper prima facie showing of lack of utility, does the burden of rebuttal shift to the 
Appellant. 

Appellants respectfully submit that the data presented in Example 170 starting on page
539 of the specification and the cumulative evidence of record support a "specific, substantial
and credible" asserted utility for the presently claimed invention.

Patentable utility for the PRO1111 polypeptides is based upon the gene amplification
data for the gene encoding the PRO1111 polypeptide of SEQ ID NO:229. Example 170

16 M.P.E.P. §2107.01. 

17 M.P.E.P. §2107 II(B)(1).

describes the results obtained using a very well-known and routinely employed polymerase chain
reaction (PCR)-based assay, the TaqMan™ PCR assay, also referred to herein as the gene
amplification assay. This assay allows one to quantitatively measure the level of gene
amplification in a given sample, such as a tumor extract or a cell line. It was well known in the art
at the time the invention was made that gene amplification is an essential mechanism for
oncogene activation. Appellants isolated genomic DNA from a variety of primary cancers and
cancer cell lines that are listed in Table 9 (pages 539 onwards of the specification), including
primary lung and colon cancers of the type and stage indicated in Table 8 (page 546). The tumor
samples were tested in triplicate with TaqMan™ primers and with internal controls, beta-actin
and GAPDH, in order to quantitatively compare DNA levels between samples (page 548, lines
33-34). As a negative control, DNA was isolated from the cells of ten normal healthy
individuals and pooled (page 539, lines 27-29); no-template controls were also included
(page 548, lines 33-34). The results of TaqMan™ PCR are reported in ΔCt
units, as explained in the passage on page 539, lines 37-39. One unit corresponds to one PCR
cycle, or approximately a 2-fold amplification relative to control; two units correspond to 4-fold,
three units to 8-fold amplification, and so on. Using this PCR-based assay, Appellants showed that
the gene encoding PRO1111 was amplified, that is, it showed approximately 1.05-1.58 ΔCt
units in seven lung tumors and 1.05-1.38 ΔCt units in four colon tumors, which corresponds to
2^1.05 to 2^1.58-fold amplification in lung tumors and 2^1.05 to 2^1.38-fold amplification in colon
tumors, respectively, or 2.0705- to 2.99-fold in seven different lung primary tumors and 2.0705- to
2.603-fold in four different colon primary tumors.
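
The conversion from ΔCt units to fold amplification described above is simple arithmetic (fold = 2^ΔCt). The following minimal Python sketch reproduces the figures quoted for PRO1111 under that assumption. The function name `fold_amplification` is illustrative only and does not appear in the specification.

```python
def fold_amplification(delta_ct: float) -> float:
    """Convert a ΔCt value to fold amplification.

    Assumes, as stated in the text, that one ΔCt unit corresponds to one
    PCR cycle, i.e. an approximately 2-fold change (fold = 2 ** ΔCt).
    """
    return 2.0 ** delta_ct


# ΔCt range endpoints quoted in the text for the gene encoding PRO1111.
delta_ct_ranges = {
    "lung": (1.05, 1.58),
    "colon": (1.05, 1.38),
}

for tissue, (low, high) in delta_ct_ranges.items():
    low_fold = fold_amplification(low)
    high_fold = fold_amplification(high)
    print(f"{tissue}: {low_fold:.4f}- to {high_fold:.3f}-fold")
```

Run as written, this gives roughly 2.07- to 2.99-fold for the lung range and 2.07- to 2.60-fold for the colon range, consistent with the values stated above.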

The Examiner has asserted that "it remains that the amplification was minimal and that
the most parsimonious explanation is aneuploidy, with no evidence that the chromosome bearing
PRO1111 was preferentially amplified (as opposed to other chromosomes)." (Pages 2-3 of the
Final Office Action mailed December 26, 2006).

Appellants submit that the Examiner seems to have applied a heightened utility standard
in this instance, which is legally incorrect. Appellants have shown that the gene encoding
PRO1111 demonstrated significant amplification, from 2.07- to 2.99-fold, in several different
primary lung and colon tumors. Appellants have submitted a Declaration by Dr. Audrey
Goddard (made of record November 9, 2005) which provides a statement by an expert in the
-16- 

On Appeal to the Board of Patent Appeals and Interferences 

Appellants' Brief 
Application Serial No. 09/991,163 
Attorney's Docket No. GNE-2730 P1C17 



relevant art that "fold amplification" values of at least 2-fold are considered significant in the
TaqMan™ PCR gene amplification assay. Appellants particularly draw the Board's attention to
page 3 of the Goddard Declaration which clearly states that:

It is further my considered scientific opinion that an at least 2-fold increase in 
gene copy number in a tumor tissue sample relative to a normal (i.e., non-tumor) 
sample is significant and useful in that the detected increase in gene copy number 
in the tumor sample relative to the normal sample serves as a basis for using 
relative gene copy number as quantitated by the TaqMan PCR technique as a 
diagnostic marker for the presence or absence of tumor in a tissue sample of 
unknown pathology. Accordingly, a gene identified as being amplified at least 2- 
fold by the quantitative TaqMan PCR assay in a tumor sample relative to a normal 
sample is useful as a marker for the diagnosis of cancer, for monitoring cancer 
development and/or for measuring the efficacy of cancer therapy. 
(Emphasis added). 

Accordingly, the 2.07-fold to 2.99-fold amplification observed in eleven different lung and colon
tumors would be considered significant and credible by one skilled in the art, based upon the facts
disclosed in the Goddard Declaration.

By referring to the 2.07-fold to 2.99-fold amplification of the PRO1111 gene in lung
tumors as "minimal," the Examiner appears to ignore the teachings within an expert's declaration
without any basis, or without presenting any evidence to the contrary. Appellants respectfully
draw the Board's attention to the Utility Examination Guidelines (Part IIB, 66 Fed. Reg. 1098
(2001)) which state that:

Office personnel must accept an opinion from a qualified expert that is based 
upon relevant facts whose accuracy is not being questioned; it is improper to 
disregard the opinion solely because of a disagreement over the significance or 
meaning of the facts offered. 

In addition, the case law has clearly established that in considering affidavit evidence, the
Examiner must consider all of the evidence of record anew. 18 "After evidence or argument is
submitted by the Applicant in response, patentability is determined on the totality of the record,



18 In re Rinehart, 531 F.2d 1084, 189 U.S.P.Q. 143 (C.C.P.A. 1976) and In re Piasecki,
745 F.2d 1015, 226 U.S.P.Q. 881 (Fed. Cir. 1985).

by a preponderance of the evidence with due consideration to persuasiveness of argument." 19
Furthermore, the Court of Appeals for the Federal Circuit held in In re Alton, "We are aware of
no reason why opinion evidence relating to a fact issue should not be considered by an
Examiner." 20

Thus, given the absence of any evidence to the contrary, Appellants maintain that the
2.07-fold to 2.99-fold amplification disclosed for the PRO1111 gene is significant and forms the
basis for the utility claimed herein. In addition, the Goddard Declaration clearly establishes that
the TaqMan real-time PCR method described in Example 170 has gained wide recognition for its
versatility, sensitivity and accuracy, and is in extensive use for the study of gene amplification.
The facts disclosed in the Declaration also confirm that based upon the gene amplification
results, one of ordinary skill would find it credible that PRO1111 is a diagnostic marker of lung
cancer.

Appellants' position is further based on the overwhelming evidence from gene (DNA)
amplification data disclosed in the specification, which clearly indicates that the gene encoding
PRO1111 is significantly amplified in certain lung and colon tumors. Based on the working
hypothesis among those skilled in the art that if a gene is amplified in cancer, the encoded
protein is likely to be expressed at an elevated level, one skilled in the art would simply accept
that since the PRO1111 gene is amplified, the PRO1111 polypeptide would be more likely than
not over-expressed. Thus, data relating to PRO1111 polypeptide expression may be used for the
same diagnostic and prognostic purposes as data relating to PRO1111 gene expression.
Therefore, based on the disclosure in the specification, no further research would be necessary to
determine how to use the claimed PRO1111 polypeptides, because the current invention is fully
enabled by the disclosure of the present application.

With respect to the Examiner's assertion that the gene amplification is due to aneuploidy, 
Appellants submit that, even if it were, it is known in the art that detection of gene amplification 
can be used for cancer diagnosis regardless of whether the increase in gene copy number results 
from intrachromosomal changes or from chromosomal aneuploidy. As explained by Dr. 

19 In re Alton, 37 U.S.P.Q.2d 1578 (Fed. Cir. 1996) at 1584 (quoting In re Oetiker, 977
F.2d 1443, 1445, 24 U.S.P.Q.2d 1443, 1444 (Fed. Cir. 1992)).

20 In re Alton, supra.

Ashkenazi in his Declaration (submitted with Appellants' Amendment and Response filed on
November 8, 2004),

An increase in gene copy number can result not only from intrachromosomal 
changes but also from chromosomal aneuploidy. It is important to understand 
that detection of gene amplification can be used for cancer diagnosis even if the 
determination includes measurement of chromosomal aneuploidy. Indeed, as 
long as a significant difference relative to normal tissue is detected, it is irrelevant 
if the signal originates from an increase in the number of gene copies per 
chromosome and/or an abnormal number of chromosomes. 

Hence, Appellants submit that amplification of a gene, whether by aneuploidy or
any other mechanism, is useful as a diagnostic marker.

Moreover, it appears that the Examiner's concern is with regard to the underlying
mechanism of gene amplification, and not with the positive results themselves. However, the
Examiner's concerns regarding the mechanism of PRO1111 gene amplification associated with
any type of cancer versus normal tissue should in no way negate the utility of the claimed
invention. The fact remains that the gene amplification results demonstrate overexpression of
PRO1111 in the named tumor. One of ordinary skill in the art does not need to know the
underlying mechanism of the overexpression of PRO1111, whether aneuploidy, mutation or
translocation, to practice the claimed invention. One of ordinary skill in the art, in possession of
these results, would have believed it more likely than not that the PRO1111 polypeptides are
useful for their asserted utility. Therefore, this rejection is not proper.

The Examiner has also alleged, based on the reference by Sen et al., that the observed
gene amplification was not corrected for aneuploidy (Office Action mailed July 6, 2004).

Appellants respectfully disagree and submit that their gene amplification data was not
due to aneuploidy. As stated above, Appellants had submitted the Ashkenazi Declaration to
show that "detection of gene amplification can be used for cancer diagnosis even if the
determination includes measurement of chromosomal aneuploidy." Regarding Sen et al.,
Appellants agree that aneuploidy can be a feature of damaged tissue as well as of
cancerous or pre-cancerous tissue, and may not invariably lead to cancer. Nevertheless, Sen et al.
in fact support the Appellants' position that PRO1111 is still useful in diagnosing pre-cancerous
lesions or cancer itself. For instance, the art in lung cancer at the time of filing of the instant
application clearly described that "epithelial tumors develop through a multistep process driven
by genetic instability" in damaged colon lesions which may eventually lead to lung cancer. Many
articles published around the effective filing date of this application studied such damaged or
premalignant lesions and suggested that identification of such pre-cancerous lesions was very
important in preventive diagnosis and treatment of lung cancer. Based on the well-known art,
Appellants submit that there is utility in identifying genetic biomarkers in epithelial tissues at
cancer risk.

The Examiner has further asserted that "it has not been established that the protein of 
SEQ ID NO: 229 is a diagnostic, ... applicants have found a single nucleic acid, SEQ ID 
NO: 228, that is found to be aneuploid in a small number of tumor cell lines," (Page 9 of the
Final Office Action mailed December 26, 2006; emphasis added). 

Appellants respectfully point out that they have shown significant DNA amplification in 
eleven different lung and colon tumor samples. The fact that not all lung tumors tested positive 
in this study does not make the gene amplification data less significant. As any skilled artisan in 
the field of oncology would easily appreciate, not all tumor markers are generally associated 
with every tumor, or even with most tumors. For example, the article by Hanna and Mornin
(submitted with the Response filed November 8, 2004), discloses that the known breast cancer 
marker HER-2/neu is "amplified and/or overexpressed in 10%-30% of invasive breast cancers 
and in 40%-60% of intraductal breast carcinoma" (page 1, col. 1). 

Appellants submit that the amplification of the PRO1111 nucleic acids in even one lung
or colon tumor provides specific and substantial utility for the nucleic acid as a diagnostic 
marker of the type of lung or colon tumor in which it was amplified. Appellants further note that 
the tumors listed in Table 8 are not similar tumors from different patients, but various 
types/classes of lung and/or colon tumors at different stages. Accordingly, a positive result from 
one tumor, where the nucleic acid was amplified, but not from other tumors, indicates that the 
nucleic acid can be used as a marker for diagnosing the presence of that kind of tumor in which it 
was amplified. Amplification of the nucleic acid would be indicative of that specific class of 
lung or colon tumor, whereas absence of amplification would be non-conclusive. The skilled 
artisan would certainly know that such tumor markers are useful for better classification of 
tumors. Therefore, whether the PRO1111 gene is amplified in eleven lung and colon tumors or
in all lung and colon tumors is not relevant to its identification as a tumor marker, or its
patentable utility. Rather, the fact that the amplification data for PRO1111 is considered
significant is what lends support to its usefulness as a tumor marker. If the goal is to diagnose 
lung cancer, then contrary to the Examiner's assertion, a positive result does indicate the 
presence of cancer, while a negative result is not conclusive, and requires follow up testing. 

C. A prima facie case of lack of utility has not been established 

The Examiner has asserted, based on Pennica et al., Konopka et al., Haynes et al., Hu et
al., Godbout et al. and Li et al., that there is a general lack of correlation between gene
amplification and mRNA expression and, thus, while the data in Table 9 may provide a basis for
utility and enablement of PRO1111 nucleic acid, it does not provide a basis for utility or
enablement of the claimed polypeptides (Office Actions mailed July 6, 2004; May 10, 2005;
December 21, 2006).

As a preliminary matter, Appellants respectfully submit that it is not a legal requirement 
to establish that gene amplification "necessarily" results in increased expression at the mRNA 
and polypeptide levels or that polypeptide levels can be "accurately predicted." As discussed 
above, the evidentiary standard to be used throughout ex parte examination of a patent 
application is a preponderance of the totality of the evidence under consideration. Accordingly, 
Appellants submit that in order to overcome the presumption of truth that an assertion of utility 
by the applicant enjoys, the Examiner must establish that it is more likely than not that one of 
ordinary skill in the art would doubt the truth of the statement of utility. Therefore, it is not 
legally required that there be a "necessary" correlation between the data presented and the 
claimed subject matter. The law requires only that one skilled in the art should accept that such a 
correlation is more likely than not to exist. Appellants respectfully submit that when the proper
evidentiary standard is applied, a correlation must be acknowledged. 

Pennica et al. 

Appellants submit that Pennica et al. does not show a lack of correlation between gene
(DNA) amplification and mRNA levels. According to the quoted statement from Pennica et al.,
"WISP-1 gene amplification in human lung tumors showed a correlation between DNA
amplification and over-expression, whereas overexpression of WISP-3 RNA was seen in the
absence of DNA amplification. In contrast, WISP-2 DNA was amplified in lung tumors, but its
mRNA expression was significantly reduced in the majority of tumors compared with expression
in normal colonic mucosa from the same patient." From this, the Examiner correctly concludes
that increased copy number does not necessarily result in increased polypeptide expression. The
standard, however, is not absolute certainty. The fact that, in the case of a specific class of
closely related molecules, there seemed to be no correlation between gene amplification and the
level of mRNA/protein expression does not establish that it is more likely than not, in general,
that such correlation does not exist. The Examiner has not shown whether the lack of correlation
observed for the family of WISP polypeptides is typical, or is merely a discrepancy, an exception
to the rule of correlation. Indeed, the working hypothesis among those skilled in the art is that, if
a gene is amplified in cancer, the encoded protein is likely to be expressed at an elevated level.
In fact, as noted even in Pennica et al., "[a]n analysis of WISP-1 gene amplification and
expression in human lung tumors showed a correlation between DNA amplification and
over-expression . . ." (Pennica et al., page 14722, left column, first full paragraph, emphasis added).

Accordingly, Appellants respectfully submit that Pennica et al. teaches nothing 
conclusive regarding the absence of correlation between amplification of a gene and over- 
expression of the encoded WISP polypeptide. More importantly, the teaching of Pennica et al. is 
specific to WISP genes. Pennica et al. has no teaching whatsoever about the correlation of gene 
amplification and protein expression in general.

Konopka et al. 

Regarding Konopka et al., Appellants submit that the Examiner has completely
misinterpreted the teachings in the cited reference. Contrary to the Examiner's assertions,
Konopka et al. does not support the position that DNA amplification is not correlated with
mRNA overexpression. Konopka et al. show only that, of the cell lines known to have increased
abl protein expression, only one had amplification of the abl gene (page 4051, col. 1). This
result proves only that increased mRNA and protein expression levels can result from causes
other than gene amplification. Konopka et al. do not demonstrate that when gene amplification
does occur, it does not result in increased mRNA and protein expression levels, particularly
given that the cell line with amplification of the abl gene did show increased abl mRNA and
protein expression levels. Furthermore, Konopka et al. supports Appellants' position that mRNA
levels correlate with protein levels. Konopka et al. state that "the 8-kb mRNA that encodes
P210c-abl was detected at a 10-fold higher level in SK-CML7bt-333 (Fig. 3A, +) than in
SK-CML16Bt-1 (B, +), which correlated with the relative level of P210c-abl detected in each cell
line. Analysis of additional cell lines demonstrated that the level of 8-kb mRNA directly
correlated with the level of P210c-abl (Table 1)" (page 4050, col. 2, emphasis added).

Haynes et al.

The Examiner has cited Haynes et al. as allegedly providing evidence that "polypeptide
levels cannot be accurately predicted from the level of the corresponding mRNA transcript."
(Page 5 of the Office Action mailed March 31, 2005).

As discussed previously, the law does not require the existence of a strong or linear
correlation between mRNA and protein levels. Nor does the law require that protein levels be
"accurately" predicted. According to the authors themselves, the Haynes data confirm that there
is a "general trend" between protein expression and transcript levels (page 1863, col. 1), which
meets the "more likely than not" standard and shows that a positive correlation exists between
mRNA and protein. For example, in Figure 1, there is a positive correlation between mRNA and
protein levels amongst most of the 80 yeast proteins studied. In fact, very few data points
deviated or scattered away from the expected normal and no data points showed a negative
correlation between mRNA and protein levels (i.e., an increase in mRNA resulted in a decrease in
protein levels). Moreover, the analysis by Haynes et al. is not relevant to the current application.
Haynes et al. studied yeast cells and not human cells. Haynes et al. note that their analysis
focused on the 80 most abundant proteins in the yeast lysate (page 1867). Haynes et al. state
"since many important regulatory protein are present only at low abundance, these would not be
amenable to analysis" (page 1867). Further, Haynes et al. compared the protein expression levels
of these naturally abundant proteins to mRNA expression levels from published SAGE frequency
tables (page 1863). Accordingly, Haynes et al. did not compare mRNA expression levels and
protein levels in the same yeast cells. Thus the analysis by Haynes et al. is not applicable to the
present application.

Hu et al.

The Examiner has further cited Hu et al. in support of the assertion that "the literature
cautions researchers from drawing conclusions based on small changes in transcript expression
levels between normal and cancerous tissue." (Page 4 of the Office Action mailed March 31,
2005).

Appellants submit that in order to overcome the presumption of truth that an assertion of
utility by the applicant enjoys, the Examiner must establish that it is more likely than not that one
of ordinary skill in the art would doubt the truth of the statement of utility. Accordingly,
contrary to the Examiner's assertion, Appellants submit that Hu et al. does not conclusively show
that it is more likely than not that gene amplification does not result in increased expression at
the mRNA and polypeptide levels. First, the title of Hu et al. is "Analysis of Genomic and
Proteomic Data Using Advanced Literature Mining." As the title clearly suggests, the
conclusion suggested by Hu et al. is merely based on a statistical analysis of the information
disclosed in the published literature. As Hu et al. states, "We have utilized a computational
approach to literature mining to produce a comprehensive set of gene-disease relationships." In
particular, Hu et al. relied on the MedGene Database and the Medical Subject Heading (MeSH)
files to analyze the gene-disease relationship. More specifically, Hu et al. "compared the
MedGene breast cancer gene list to a gene expression data set generated from a micro-array
analysis comparing breast cancer and normal breast tissue samples." (See page 408, right
column).

Therefore, Appellants first submit that the reference by Hu et al. concerns only the
statistical analysis of micro-array data and not gene amplification data. Its findings
would therefore not be directly applicable to gene amplification data. In addition, Appellants
respectfully submit that the Hu et al. reference does not show that a lack of correlation between
microarray data and the biological significance of cancer genes is typical.

According to Hu et al., "different statistical methods" were applied to "estimate the
strength of gene-disease relationships and evaluated the results." (See page 406, left column,
emphasis added). Using these different statistical methods, Hu et al. "[a]ssessed the relative
strengths of gene-disease relationships based on the frequency of both co-citation and single
citation." (See page 411, left column). It is well known in the art that various statistical methods
allow different variables to be manipulated to affect the outcome. For example, the authors
admit, "Initial attempts to search the literature using" the list of genes, gene names, gene
symbols, and frequently used synonyms, generated by the authors "revealed several sources of
false positives and false negatives." (See page 406, right column). The authors further admit that
the false positives caused by "duplicative and unrelated meanings for the term" were "difficult to
manage." Therefore, in order to minimize such false positives, Hu et al. disclose that these terms
"had to be eliminated entirely, thereby reducing the false positive rate but unavoidably
under-representing some genes." 21 Hence, Appellants respectfully submit that in order to minimize
the false positives and negatives in their analysis, Hu et al. manipulated various aspects of the
input data.

Appellants further submit that the statistical analysis by Hu et al. is not a reliable standard
because the frequency of citation reflects only the current research interest in a molecule rather
than the true biological function of the molecule. Indeed, the authors acknowledge that
"[r]elationship established by frequency of co-citation do not necessarily represent a true
biological link." (See page 411, right column). It often happens in scientific study that important
molecules are overlooked by the scientific community for many years until the discovery of their
true function. Therefore, Appellants submit that Hu et al. drew their conclusion based on a very
unreliable standard and that their research does not provide any meaningful information
regarding the correlation between microarray data and the biological significance of a molecule.

Even assuming that Hu et al. provide evidence to support a true relationship, the
conclusion in Hu et al. applies only to a specific type of breast tumor (estrogen receptor
(ER)-positive breast tumor) and cannot be generalized as a principle governing microarray study of
breast cancer in general, let alone the various other types of cancer genes in general.
In fact, even Hu et al. admit that, "[i]t is likely that this threshold will change depending on the
disease as well as the experiment. Interestingly, the observed correlation was only found among
ER-positive (breast) tumors not ER-negative tumors." (See page 412, left column). Therefore,
based on these findings, the authors add, "This may reflect a bias in the literature to study the
more prevalent type of tumor in the population. Furthermore, this emphasizes that caution must
be taken when interpreting experiments that may contain subpopulations that behave very
differently." 22

21 Id.

22 Id. (emphasis added).

Godbout et al.

Regarding Godbout, the Examiner has asserted that Godbout et al. teaches that "a number 
of studies suggest that co-amplified genes are only overexpressed if they provide a selective 
advantage to the cells in which they are amplified." The Examiner further asserts that Godbout 
teaches "[i]t is generally accepted that co-amplified genes are not over-expressed unless they 
provide a selective growth advantage to the cell." (Pages 7-8 of the Final Office Action mailed 
December 26, 2006). 

Appellants have previously made of record three more recent references, published in
2002, by Orntoft et al., Hyman et al., and Pollack et al. (made of record in Appellants' Response
filed on November 8, 2004), which collectively teach that in general, gene amplification
increases mRNA expression. Appellants submit that these more recent references must be
acknowledged as more accurately reflecting the state of the art regarding the correlation between
gene amplification and transcript expression than the references cited by Godbout et al.

Appellants further maintain that Godbout et al. report that "there is a good correlation 
with DDX1 gene copy number, DDX1 transcript levels, and DDX1 protein levels in all cell lines 
studied." Thus, in these cancer cell lines, DDX1 mRNA and protein levels are correlated. 

Moreover, selective advantage to cell survival is not the only mechanism by which genes 
impact cancer. Mechanistic data is not a requirement for the utility requirement. Hence, this 
rejection is improper. Appellants respectfully submit that, as discussed above, Orntoft et al, 
Hyman et al, and Pollack et al, (of record), collectively teach that gene amplification increases 
mRNA expression for large numbers of genes, which have not been identified as being 
oncogenes or as conferring any selective growth advantage on tumor cells. Thus, the art of 
record clearly shows that there is no requirement that a polypeptide must be a known oncogene 
or a protein otherwise known to be associated with tumor growth, in order for amplification of 
the gene encoding the protein to correlate with increased protein expression. In fact, as 
demonstrated by Orntoft et al., Hyman et al., and Pollack et al., examination of gene
amplification is a useful way to identify novel proteins not previously known to be associated 
with cancer. 

The Examiner has asserted that unlike Godbout et al., the instant specification does not
teach structure/function analysis and the Examiner questions whether the level of genomic
amplification of DDX1 gene is comparable to that disclosed for PRO1111 (Page 8 of the Final
Office Action mailed December 26, 2006).

Appellants respectfully submit that it was never claimed that PRO1111 is similar in any
way to the DDX1 gene of Godbout et al.; Appellants never claimed PRO1111 was an RNA helicase
or that it confers selective advantage to cell survival. Rather, the Godbout reference was
submitted to show good correlation between protein levels based upon genomic DNA
amplification, which the Examiner clearly agrees with. Moreover, selective advantage to cell
survival is not the only mechanism by which genes impact cancer. Structure/function data,
which the Examiner requests, is not a requirement for the utility requirement. Hence this
rejection is improper.

Li et al. 

The Examiner has cited Li et al. as teaching that "68.8% of the genes showing
over-representation in the genome did not show elevated transcript levels." (Page 8 of the Office
Action mailed December 26, 2006).

Appellants respectfully point out that Li et al. acknowledge that their results differed
from those obtained by Hyman et al. and Pollack et al. (of record), who found a substantially
higher level of correlation between gene amplification and increased gene expression. The
authors note that "[t]his discordance may reflect methodologic differences between studies or
biological differences between breast cancer and lung adenocarcinoma" (page 2629, col. 1). For
instance, as explained in the Supplemental Information accompanying the Li article, genes were
considered to be amplified if they had a copy number ratio of at least 1.40. In the case of
PRO1111, as discussed in previously filed responses and in the Goddard Declaration (of record),
an appropriate threshold for considering gene amplification to be significant is a copy number of
at least 2.0 (which is a higher threshold). The PRO1111 gene showed significant amplification
of 2.07-fold to 2.99-fold in eleven different lung and colon primary tumors, and thus fully meets
this standard. It is not surprising that in the Li et al. reference, by using a lower threshold of 1.4
for considering gene amplification, a higher number of genes not showing corresponding
increases in mRNA expression were found. Moreover, Appellants add that the results of Li et al.
do not conclusively disprove that a gene with a substantially higher level of gene amplification,
such as PRO1111, would be expected to show a corresponding increase in transcript expression.
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The Patent Office has failed to meet its initial burden of proof that Appellants' claims of 
utility are not substantial or credible. The arguments presented by the Examiner in combination 
with the Sen et al., Pennica et al., Konopka et al., Haynes et al., Hu et al., Godbout et al. and Li
et al. articles do not provide sufficient reasons to doubt the statements by Appellants that
PRO1111 has utility. As discussed above, the law does not require that gene amplification
"necessarily" results in increased expression at the mRNA and polypeptide levels. Therefore, 
Appellants submit that the Examiner's reasoning is based on a misrepresentation of the scientific 
data presented in the above cited references and application of an improper, heightened legal 
standard. In fact, contrary to what the Examiner contends, the art indicates that, if a gene is 
amplified in cancer, it is more likely than not that the encoded protein will be expressed at an 
elevated level. 

D. It is "more likely than not" for amplified genes to have increased mRNA 

On the contrary, Appellants submit that Example 170 of the specification further 
discloses that, "[a]mplification is associated with overexpression of the gene product, indicating
that the polypeptides are useful targets for therapeutic intervention in certain cancers such as 
lung, colon, breast and other cancers and diagnostic determination of the presence of those 
cancers" (Emphasis added). Besides, Appellants have submitted ample evidence to show that, in 
general, if a gene is amplified in cancer, it is "more likely than not" that the corresponding 
mRNA will also be expressed at an elevated level. 

For instance, Appellants presented the articles by Orntoft et al., Hyman et al., and
Pollack et al. (made of record in Appellants' Response filed November 8, 2004), which
collectively teach that in general, for most genes, DNA amplification increases mRNA
expression. Second, Appellants have submitted over one hundred references, along with
Declarations by Dr. Paul Polakis with their Responses of September 29, 2006 and November 8,
2004, which collectively teach that, in general, there is a correlation between mRNA levels and
polypeptide levels. Taken together, although there are some examples in the scientific art that do
not fit within the central dogma of molecular biology that there is a correlation between
polypeptide and mRNA levels, these instances are exceptions rather than the rule. In the
majority of amplified genes, the teachings in the art, as exemplified by Orntoft et al., Hyman et
al., Pollack et al., and the Polakis Declarations, overwhelmingly show that gene amplification
influences gene expression at the mRNA and protein levels. Therefore, one of skill in the art
would reasonably expect in this instance, based on the amplification data for the PRO1111 gene,
that the PRO1111 polypeptide is concomitantly overexpressed. Thus, Appellants submit that the
claimed PRO1111 polypeptides have utility in the diagnosis of cancer.

Orntoft et al., Hyman et al., and Pollack et al.

The results presented by Orntoft et al., Hyman et al., and Pollack et al. are based upon
wide ranging analyses of a large number of tumor associated genes. Orntoft et al. studied
transcript levels of 5600 genes in malignant bladder cancers, many of which were linked to the
gain or loss of chromosomal material, and found that in general (18 of 23 cases) chromosomal
areas with more than 2-fold gain of DNA showed a corresponding increase in mRNA transcripts.
Hyman et al. compared DNA copy numbers and mRNA expression of over 12,000 genes in
breast cancer tumors and cell lines, and found that there was evidence of a prominent global
influence of copy number changes on gene expression levels. In Pollack et al., the authors
profiled DNA copy number alteration across 6,691 mapped human genes in 44 predominantly
advanced primary breast tumors and 10 breast cancer cell lines, and found that on average, a
2-fold change in DNA copy number was associated with a corresponding 1.5-fold change in
mRNA levels. Thus, these articles collectively teach that in general, gene amplification
increases mRNA expression.

The Examiner appears to disregard the ample evidence provided in the above referenced 
articles based on misinterpretations of their teachings. Appellants submit that in fact, these 
articles lend significant support that for an amplified gene, it is more likely than not that the 
protein will also be overexpressed and would be viewed as reasonable and credible by one of 
ordinary skill in the art. The "more likely than not" standard is a much lower standard than a 
"necessary" correlation or "accurate" prediction, and is clearly met in the claimed invention. 
Moreover, the Examiner has not cited any evidence or advanced any arguments as to why 
Appellants' statement of overexpression of protein would not be credible. Accordingly, this 
point is believed to be moot. 

The Examiner has asserted that "Orntoft et al. could only compare the levels of about 40
well-resolved and focused abundant proteins." (Page 6 of the Office Action mailed March 31,
2006; emphasis in original). While technical considerations did prevent Orntoft et al. from
evaluating a larger number of proteins, the ones they did look at showed a clear correlation
between mRNA and protein expression levels. As Orntoft et al. state, "In general there was a
highly significant correlation (p<0.005) between mRNA and protein alterations. . . . 26 well
focused proteins whose genes had a known chromosomal location were detected in TCCs 733
and 335, and of these 19 correlated (p<0.005) with the mRNA changes detected using the
arrays." (See page 42, column 2 to page 43, column 2). Accordingly, Orntoft et al. clearly
support Appellants' position that proteins expressed by genes that are amplified in tumors are 
useful as cancer markers. 

The Examiner has further asserted that "applicants have provided no fact or evidence 
concerning a correlation between such low levels of amplification of DNA, found in only a 
minority of tested tumors which were not characterized on the basis of those in the Orntoft 
publication, and an associated rise in level of the encoded protein." (Page 6 of the Office Action 
mailed March 31, 2006). As discussed above, the levels of amplification for PRO1111 were not
"low" but significant, and ranged from 2.07-fold to 2.99-fold, in eleven different lung and colon
tumors. Appellants note that the levels of gene amplification observed by Orntoft et al. were
relatively low, averaging only 0.3-0.4-fold (page 40, col. 1). In particular, the level of gene
amplification associated with expression changes was only around two-fold (see Figure 2), even
less than the 2.07-fold to 2.99-fold amplification observed for PRO1111. Even with these
relatively low levels of gene amplification, Orntoft et al. found that "[i]n most cases, 
chromosomal gains detected by CGH were accompanied by an increased level of transcripts in 
both TCCs 733 (77%) and 827 (80%)" (page 40, col. 2; emphasis added). The level of 
correlation between DNA copy number and increased mRNA levels observed by Orntoft et al., 
from 77-80%, clearly meets the standard of more likely than not. Orntoft et al. also found a 
"highly significant" correlation between mRNA and protein levels, with the two data sets studied 
having correlations of 39/40 (98%) and 19/26 (73%) (pages 42-43). 

Appellants respectfully submit that the Examiner also appears to misunderstand the data 
presented by Hyman et al. The Examiner asserts that "of the 12,000 transcripts analyzed, a set of 
270 was identified in which overexpression was attributable to gene amplification." The 
Examiner concludes that "[t]his proportion is 2%; the Examiner maintains that 2% does not 
provide a reasonable expectation that the slight amplification of SEQ ID NO: 206 would be
correlated with elevated levels of mRNA." (Page 6 of the Office Action mailed March 31, 
2006). Appellants respectfully submit that the Examiner appears to have misinterpreted the 
results of Hyman et al. Hyman et al. chose to do a genome-wide analysis of a large number of 
genes, most of which, as shown in Figure 2, were not amplified. Accordingly, the 2% number is 
meaningless, as the low figure mainly results from the fact that only a small percentage of genes 
are amplified in the first place. The significant figure is not the percentage of genes in the 
genome that show amplification, but the percentage of amplified genes that demonstrate 
increased mRNA and protein expression. 

The Examiner has further asserted that the Hyman reference "found 44% of highly 
amplified genes showing overexpression at the mRNA level, and 10.5% of highly overexpressed 
genes being amplified; thus, even at the level of high amplification and high overexpression, the 
two do not correlate." (Page 6 of the Office Action mailed March 31, 2006). Appellants submit 
that the 10.5% figure is not relevant to the issue at hand. One of skill in the art would understand 
that there can be more than one cause of overexpression. The issue is not whether 
overexpression is always, or even typically caused by gene amplification, but rather, whether 
gene amplification typically leads to overexpression. 

The Examiner's assertion is not consistent with the interpretation Hyman et al. 
themselves place on their data, stating that, "The results illustrate a considerable influence of 
copy number on gene expression patterns." (page 6242, col. 1; emphasis added). In the more
detailed discussion of their results, Hyman et al. teach that "[u]p to 44% of the highly amplified 
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to the global upper 7% of 
expression ratios) compared with only 6% for genes with normal copy number." (See page 
6242, col. 1; emphasis added). These details make it clear that Hyman et al. set a highly 
restrictive standard for considering a gene to be overexpressed; yet almost half of all highly 
amplified transcripts met even this highly restrictive standard. Therefore, the analysis performed 
by Hyman et al. clearly shows that "it is more likely than not" that a gene which is amplified in 
tumor cells will have increased gene expression. 

The Examiner has also asserted that neither Hyman et al. nor Pollack et al. examines 
protein expression. (Page 6 of the Office Action mailed March 31, 2006). 
Appellants respectfully submit that the Orntoft et al., Hyman et al. and Pollack et al.
references were submitted primarily as evidence that in general, gene amplification increases
mRNA expression. With regard to the correlation between mRNA expression and protein levels,
Appellants previously submitted a Declaration by Dr. Polakis, principal investigator of the
Tumor Antigen Project of Genentech, Inc., the assignee of the present application, to show that
mRNA expression correlates well with protein levels, in general.

Polakis Declarations 

In addition, in their Responses filed September 29, 2006 and November 8, 2004,
Appellants submitted two Declarations by Dr. Polakis, principal investigator of the Tumor
Antigen Project of Genentech, Inc., the assignee of the present application, to show that mRNA
expression correlates well with protein levels, in general. As Dr. Polakis explains, the primary
focus of the microarray project was to identify tumor cell markers useful as targets for both the
diagnosis and treatment of cancer in humans. The Declaration by Dr. Paul Polakis (Polakis I -
made of record in Appellants' Response filed July 19, 2004) explains that in the course of Dr.
Polakis' research using microarray analysis, he and his co-workers identified approximately 200
gene transcripts that are present in human tumor cells at significantly higher levels than in
corresponding normal human cells. Appellants submit that Dr. Polakis' Declaration was
presented to support the position that there is a correlation between mRNA levels and
polypeptide levels. The second Declaration by Dr. Polakis (Polakis II - made of record in
Appellants' Preliminary Amendment filed September 12, 2006) presented evidentiary data in
Exhibit B. Exhibit B of the Declaration identified 28 gene transcripts out of 31 gene transcripts
(i.e., greater than 90%) that showed good correlation between tumor mRNA and tumor protein
levels. As Dr. Polakis' Declaration (Polakis II) states, "[a]s such, in the cases where we have been
able to quantitatively measure both (i) mRNA and (ii) protein levels in both (i) tumor tissue and
(ii) normal tissue, we have observed that in the vast majority of cases, there is a very strong
correlation between increases in mRNA expression and increases in the level of protein encoded
by that mRNA." Accordingly, Dr. Polakis has provided the facts to enable the Examiner to draw
independent conclusions regarding protein data. Appellants further emphasize that the opinions
expressed in the Polakis Declaration, including in the above quoted statement, are all based on
factual findings. For instance, antibodies binding to about 30 of these tumor antigens were
prepared and mRNA and protein levels were compared. In approximately 80% of the cases, the
researchers found that increases in the level of a particular mRNA correlated with changes in the
level of protein expressed from that mRNA when human tumor cells are compared with their
corresponding normal cells. Therefore, Dr. Polakis' research, which is referenced in his
Declaration, shows that, in general, there is a correlation between increased mRNA and
polypeptide levels. Hence, one of skill in the art would reasonably expect that, based on the gene 
amplification data of the PRO853 gene, the PRO853 polypeptide is concomitantly overexpressed
in lung and colon tumors studied as well. Based on these experimental data and his vast 
scientific experience of more than 20 years, Dr. Polakis states that, for human genes, increased 
mRNA levels typically correlate with an increase in abundance of the encoded protein. He 
further confirms that "it remains a central dogma in molecular biology that increased mRNA 
levels are predictive of corresponding increased levels of the encoded protein." 

With regard to the correlation between mRNA expression and protein levels, the 
Examiner has recently acknowledged that "the predictability of protein on the basis of mRNA is 
not at issue in this case, which deals with the amplification of genomic DNA, not mRNA." (Page 
3 of the Office Action mailed April 7, 2008). Appellants submit that the Polakis
Declarations are nevertheless relevant in this application based on the gene amplification utility,
because they form a critical piece of evidence in this case. When placed together with the entire
evidence presented for PRO1111, one would logically come to the conclusion that, it is more
likely than not, that increased DNA levels generally correlate well with increased mRNA levels 
(based on, for example, the teachings of supportive references like Orntoft et al., Hyman et al., 
Pollack et al., Bea et al., Godbout et al., etc.), and further, increased mRNA levels generally 
correlate well with increased protein levels (the two Polakis Declarations, and over 100 
supporting references). In summary, Appellants have presented multiple pieces of evidence, 
such as the Goddard Declaration, the Ashkenazi Declaration, two Polakis Declarations, and 
several references addressing the relationship between DNA and mRNA/protein levels, etc.,
each of which is critical evidence that supports Appellants' position that PRO1111 polypeptides
have utility based on the gene amplification results. Therefore, Appellants believe that a sound
case has been presented for utility of PRO1111 as a diagnostic marker, based on the gene
amplification data of its corresponding gene in the specification. 
In summary, the evidence supports the Appellants' position that gene amplification is
more likely than not predictive of increased mRNA and polypeptide levels. 

Even if a prima facie case of lack of utility has been established, it should be 
withdrawn on consideration of the totality of evidence 

Even if one assumes arguendo that it is more likely than not that there is no correlation
between gene amplification and increased mRNA/protein expression, which Appellants submit is
not true, a polypeptide encoded by a gene that is amplified in cancer would still have a specific,
substantial, and credible utility. In support, Appellants respectfully draw the Board's attention to
page 2 of the Declaration of Dr. Avi Ashkenazi (submitted with the Response filed November 8,
2004) which explains that,

even when amplification of a cancer marker gene does not result in significant 
over-expression of the corresponding gene product, this very absence of gene 
product over-expression still provides significant information for cancer diagnosis 
and treatment. Thus, if over-expression of the gene product does not parallel gene 
amplification in certain tumor types but does so in others, then parallel monitoring 
of gene amplification and gene product over-expression enables more accurate 
tumor classification and hence better determination of suitable therapy. In 
addition, absence of over-expression is crucial information for the practicing 
clinician. If a gene is amplified but the corresponding gene product is not over- 
expressed, the clinician accordingly will decide not to treat a patient with agents 
that target that gene product. 

Appellants thus submit that simultaneous testing of gene amplification and gene product 
over-expression enables more accurate tumor classification, even if the gene-product, the protein, 
is not over-expressed. This leads to better determination of a suitable therapy. Further, as 
explained in Dr. Ashkenazi's Declaration, absence of over-expression of the protein itself is
crucial information for the practicing clinician. If a gene is amplified in a tumor, but the 
corresponding gene product is not over-expressed, the clinician will decide not to treat a patient 
with agents that target that gene product. This not only saves money, but also has the benefit that 
the patient can avoid exposure to the side effects associated with such agents. 

This utility is further supported by the teachings of the article by Hanna and Mornin
(Pathology Associates Medical Laboratories, August (1999), submitted with the Response filed
July 20, 2004). The article teaches that the HER-2/neu gene has been shown to be amplified
and/or over-expressed in 10%-30% of invasive breast cancers and in 40%-60% of intraductal 
breast carcinomas. Further, the article teaches that diagnosis of breast cancer includes testing 
both the amplification of the HER-2/neu gene (by FISH) as well as the over-expression of the 
HER-2/neu gene product (by IHC). Even when the protein is not over-expressed, the assay 
relying on both tests leads to a more accurate classification of the cancer and a more effective 
treatment of it. 

Appellants have clearly shown that the gene encoding the PRO1111 polypeptide is
amplified in at least eleven different lung and colon tumors. Therefore, the PRO1111 gene,
similar to the HER-2/neu gene disclosed in Hanna et al., is a tumor associated gene.
Furthermore, as discussed above, in the majority of amplified genes, the teachings in the art
overwhelmingly show that gene amplification influences gene expression at the mRNA and
protein levels. Therefore, one of skill in the art would reasonably expect in this instance, based
on the amplification data for the PRO1111 gene, that the PRO1111 polypeptide is concomitantly
overexpressed. 

Thus, based on the asserted utility for PRO1111 in the diagnosis of lung and colon
tumors, the reduction to practice of the instantly claimed protein sequence of SEQ ID NO:229 in
the present application, the disclosure of the step-by-step protocols for making chimeric PRO
polypeptides, including those wherein the heterologous polypeptide is an epitope tag or an Fc
region of an immunoglobulin in the specification (at page 374, line 24 to page 375, line 9), the
disclosure of a step-by-step protocol for making and expressing PRO1111 in appropriate host
cells (in Examples 140-143 and page 376, line 12), the step-by-step protocol for the preparation,
isolation and detection of monoclonal, polyclonal and other types of antibodies against the
PRO1111 protein in the specification (at pages 390-395) and the disclosure of the gene
amplification assay in Example 170, the skilled artisan would know exactly how to make and use 
the claimed polypeptide for the diagnosis of lung and colon cancers. Appellants submit that 
based on the detailed information presented in the specification and the advanced state of the art 
in oncology, the skilled artisan would have found such testing routine and not 'undue'. 
Therefore, Appellants respectfully request reconsideration and reversal of the
outstanding rejections under 35 U.S.C. §101 and §112, first paragraph, of Claims 124-126 and
129-133. 

ISSUE 3: Claims 132-133 satisfy the enablement requirement of 35 USC §112, first 
paragraph. 

Claims 132-133 are rejected under 35 U.S.C. §112, first paragraph, because allegedly 
"the specification, while being enabling for the protein of SEQ ID NO: 229 or fragments thereof 
for making antibodies or having chondrocyte re-differentiation activity, does not reasonably
provide enablement for proteins that are encoded by a nucleic acid that is amplified in 
adenocarcinomas or squamous cell carcinomas of the lung or in adenocarcinomas of the colon." 
(Page 4 of the Final Office Action mailed April 7, 2008). 

Appellants submit that they rely on the 'gene amplification' assay (Example 170), not the
chondrocyte re-differentiation assay, for patentable utility of the instantly claimed subject matter.
The teachings of the specification should be evaluated through the eyes of one skilled in the 
pertinent art at the effective filing date of June 23, 1999 of the present application. As the 
M.P.E.P. states, "[t]he fact that experimentation may be complex does not necessarily make it
undue, if the art typically engages in such experimentation."23 Further, a considerable amount of
experimentation is permissible, if it is merely routine. 

Claims 132-133 recite an isolated polypeptide comprising an amino acid sequence having 
at least 95% and 99% sequence identity, respectively, to the amino acid sequence of SEQ ID 
NO:229 and with the functional recitation "wherein the nucleic acid encoding said polypeptide is 
amplified in adenocarcinomas or squamous cell carcinomas of the lung or in adenocarcinomas of 
the colon." By following the disclosure in the specification, particularly the gene amplification 
assay of Example 170, one skilled in the art could easily test whether the nucleic acid encoding a
variant PRO1111 polypeptide was amplified in adenocarcinomas or squamous cell carcinomas of the lung or in
adenocarcinomas of the colon. Those variants whose encoding nucleic acids are not amplified in 
lung or colon tumors are not encompassed by the claims. Appellants further submit that the
claims recite native sequence polypeptide variants. It is understood that many polypeptides and
especially tumor antigens are known to have different isoforms or variants.24 One of skill in the
art would therefore reasonably expect there to be variants of PRO1111 that are also amplified in
lung or colon tumors. The specification has provided detailed protocols for the gene
amplification assay, in Example 170, such that one of ordinary skill in the art could identify
those variants meeting the limitations of the claims, without any undue experimentation.
Appellants claim only those variants which meet both recitations of the claims. Thus, these
recitations clearly act to further define the claimed genus of Claims 132 and 133.

23 M.P.E.P. §2164.01, citing In re Certain Limited-Charge Cell Culture Microcarriers, 221 USPQ 1165,
1174 (Int'l Trade Comm'n 1983), aff'd sub nom. Massachusetts Institute of Technology v. A.B. Fortia, 774 F.2d 1104,
227 USPQ 428 (Fed. Cir. 1985).

The specification further describes methods for the determination of percent identity 
between two amino acid sequences. (See page 306, line 14, to page 308, line 6). In fact, the 
specification teaches specific parameters to be associated with the term "percent identity" as 
applied to the present invention. The specification further provides detailed guidance as to 
changes that may be made to a PRO polypeptide without adversely affecting its activity. 
(Page 371, line 6, to page 373, line 17). This guidance includes a listing of exemplary and 
preferred substitutions for each of the twenty naturally occurring amino acids. (Table 6, 
page 372). Accordingly, one of skill in the art would be able to identify whether a variant 
PRO1111 sequence falls within the parameters of the claimed invention. Once such an amino
acid sequence is identified, the specification sets forth methods for making the amino acid 
sequences (see page 371, line 6, to page 375, line 9) and methods of preparing the PRO 
polypeptides. (See page 375, line 11 and onward).

Therefore, Appellants respectfully submit that the specification provides ample guidance 
such that one of skill in the art could readily test a variant polypeptide. This biological activity 
together with the well defined relatively high degree of sequence identity and general knowledge 
in the art at the time the invention was made, sufficiently defines the claimed genus such that, 
one skilled in the art, at the effective date of the present application, would have known how to
make and use the claimed polypeptide sequences without undue experimentation.

24 Peng et al., Cancer Research 64:8911-8918 (2004); Kiss et al., Anticancer Research 24:3965-3970
(2004); Perego et al., Molecular Carcinogenesis 42(4):229-239 (2005); Nagao et al., Genomics 85:462-471 (2005);
Hong et al., Cancer Research 64:5504-5510 (2004) (previously submitted).

Hence Appellants respectfully request reconsideration and reversal of the enablement 
rejection of Claims 132-133 under 35 U.S.C. §112, first paragraph. 

ISSUE 4: Claims 132-133 satisfy the written description requirement of 35 USC §112, first
paragraph. 

Claims 132-133 are rejected under 35 U.S.C. §112, first paragraph, allegedly because the 
specification does not describe the claimed invention in such a way as to reasonably convey to 
one skilled in the art that the inventors, at the time the application was filed, had possession of 
the claimed invention. (Page 5 of the Final Office Action mailed April 7, 2008).

In particular, the Examiner has taken the position that, while the specification provides
adequate description for the polypeptide of SEQ ID NO:229, there is insufficient written
description as to the identity of a polypeptide having at least 95% or 99% sequence identity to
SEQ ID NO:229. The Examiner has asserted that the polypeptides encompassed by the
broad definition of 95% or 99% identical to SEQ ID NO:229 are all required to practice the
instantly claimed invention, and as stated in the previous office action, the specification does not
provide an adequate written description of the broad genus having potentially highly diverse
functions as encompassed by the phrase 95% or 99% sequence identity.

Coupled with the general knowledge available in the art at the time of the invention, 
Appellants submit that the specification provides ample written support for the claimed 
polypeptides. Thus, based on the high percentage of sequence identity, one skilled in the art 
would have known at the time of the invention that the Appellants had possession of the claimed 
polypeptides. 

A. The Legal Test for Written Description 

The well-established test for sufficiency of support under the written description 
requirement of 35 U.S.C. §112, first paragraph is "whether the disclosure of the application as 
originally filed reasonably conveys to the artisan that the inventor had possession at that time of 
the later claimed subject matter, rather than the presence or absence of literal support in the 
specification for the claim language."25, 26 The adequacy of written description support is a
factual issue and is to be determined on a case-by-case basis.27 The factual determination in a
written description analysis depends on the nature of the invention and the amount of knowledge
imparted to those skilled in the art by the disclosure.28, 29

In Environmental Designs, Ltd. v. Union Oil Co.,30 the Federal Circuit held, "Factors
that may be considered in determining level of ordinary skill in the art include (1) the educational
level of the inventor; (2) type of problems encountered in the art; (3) prior art solutions to those
problems; (4) rapidity with which innovations are made; (5) sophistication of the technology;
and (6) educational level of active workers in the field." (Emphasis added).31 Further, the
"hypothetical 'person having ordinary skill in the art' to which the claimed subject matter
pertains would, of necessity have the capability of understanding the scientific and engineering
principles applicable to the pertinent art."32, 33

B. The Disclosure Provides Sufficient Written Description for the Claimed 
Invention 

Appellants respectfully submit that the instant specification evidences the actual 
reduction to practice of the amino acid sequence of SEQ ID NO:229. Thus, the genus of 
polypeptides with at least 95% sequence identity to SEQ ID NO:229 would meet the
requirement of 35 U.S.C. §112, first paragraph, as providing adequate written description.

25 In re Kaslow, 707 F.2d 1366, 1374, 212 U.S.P.Q. 1089, 1096 (Fed. Cir. 1983).

26 See also Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 U.S.P.Q.2d at 1116 (Fed. Cir. 1991).

27 See e.g., Vas-Cath, 935 F.2d at 1563; 19 U.S.P.Q.2d at 1116.

28 Union Oil v. Atlantic Richfield Co., 208 F.2d 989, 996 (Fed. Cir. 2000).

29 See also M.P.E.P. §2163 II(A).

30 713 F.2d 693, 696, 218 U.S.P.Q. 865, 868 (Fed. Cir. 1983), cert. denied, 464 U.S. 1043 (1984).

31 See also M.P.E.P. §2141.03.

32 Ex parte Hiyamizu, 10 U.S.P.Q.2d 1393, 1394 (Bd. Pat. App. & Inter. 1988) (emphasis added).

33 See also M.P.E.P. §2141.03.

Appellants respectfully submit that the instant claims are similar to the exemplary claim 
in Example 10 of the revised Training Manual on Written Description Guidelines issued by the 
U.S. Patent Office. 

Example 10 of the Training Manual clearly states that the protein variants meet the 
requirements of 35 U.S.C. §112, first paragraph, as providing adequate written description for 
the claimed invention even if the specification contemplates but does not exemplify variants of 
the protein if: (1) the procedures for making such variant proteins are routine in the art, (2) the 
specification does not describe the complete structure or physical properties of the variants, 
although those skilled in the art would expect members of the genus to have properties similar to 
those of the reference sequence because of high degree of structural similarity, and (3) the 
variant proteins of the genus possess a significant degree of partial structure (see Claim 2 of 
Example 10). 

Appellants submit that all the requirements in Example 10 are met for the variant 
polypeptides of Claims 132-133. In particular, Claims 132-133 require that the variant polypeptide of 
PRO1111 share a high sequence identity to SEQ ID NO:229. In addition, the procedures for 
making variant polypeptides of SEQ ID NO:229 are well known in the art and described in detail 
in the specification. The instant specification includes extensive step-by-step guidance 
on how to make and prepare nucleic acids encoding polypeptides that have 95% to 
99% identity to the polypeptide of SEQ ID NO:229. For instance, the specification describes 
methods for the determination of percent identity between two amino acid sequences. In fact, 
the specification teaches specific parameters to be associated with the term "percent identity" as 
applied to the present invention. The specification further provides detailed guidance as to 
changes that may be made to a PRO polypeptide without adversely affecting its activity. This 
guidance includes a listing of exemplary and preferred substitutions for each of the twenty 
naturally occurring amino acids (Table 6). Accordingly, one of skill in the art could identify 
whether a variant PRO1111 sequence falls within the parameters of the claimed invention. Once 
such an amino acid sequence is identified, the specification sets forth methods for making the 
amino acid sequences and methods of preparing the PRO polypeptides. Appellants claim only 
those polypeptides which meet the stated guidelines. 

Therefore, Appellants submit that the specification provides ample guidance such that 
one skilled in the art would know that Appellants possessed the invention as claimed in the 
instant claims at the time of filing of the application. Accordingly, Appellants respectfully 
request reconsideration and reversal of the written description rejection of Claims 132-133 
under 35 U.S.C. §112, first paragraph. 

ISSUE 5: Claims 132-133 satisfy the requirements of 35 USC §112, second 
paragraph. 

Claims 132-133 stand rejected under 35 U.S.C. §112, second paragraph, for allegedly 
"being indefinite." The Examiner contends that "the metes and bounds of proteins encoded by 
the nucleic acids 'amplified in adenocarcinomas or squamous cell carcinomas of the lung or in 
adenocarcinomas of the colon' . . .cannot be determined." (Page 7 of the Final Office Action 
mailed April 7, 2008) 

Appellants respectfully disagree. As discussed above under the Enablement and Written 
Description issues, Appellants submit that the instant specification provides a detailed description 
for identifying the genus of nucleic acids that code for polypeptides with 95% similarity to the 
polypeptide of SEQ ID NO:229 and which further possess the functional property recited as 
"wherein the nucleic acid encoding said polypeptide is amplified in adenocarcinomas or squamous 
cell carcinomas of the lung or in adenocarcinomas of the colon." It also provides step-by-step 
guidelines and protocols for testing these DNAs in the gene amplification assay, a PCR-based 
assay, at least in Example 170. Accordingly, the metes and bounds of proteins encoded by such 
nucleic acids are clearly defined such that one skilled in the art would know how to make the 
invention. Appellants respectfully request that the present rejection be reconsidered and that the 
rejection of Claims 132-133 under 35 U.S.C. §112, second paragraph, be reversed. 

ISSUE 6: Claims 124-126, 129 and 132-133 are not anticipated under 35 U.S.C. 
§102(a) by Wang et al., Genbank Accession No. AF196976 (October 1999). 

Claims 124-126, 129 and 132-133 stand rejected under 35 U.S.C. § 102(a) allegedly as 
being anticipated by Wang et al. (Genbank Accession No. AF196976; pub 10/20/1999). (Page 9 
of the Final Office Action mailed April 7, 2008). 

Appellants submit that they rely on the 'gene amplification' assay (Example 170) for 
patentable utility of the instantly claimed subject matter. This utility was first disclosed in 
Example 23 in the U.S. Provisional Patent Application Serial No. 60/141,037, filed 
June 23, 1999, priority for which has been claimed in this application and relevant pages of 
which have been submitted to the Examiner in a previous response. Appellants are at least 
entitled to an effective filing date of June 23, 1999 based on the results of the 'gene 
amplification' assay for the currently pending claims, and this date precedes the publication date 
for Wang et al. Therefore, Wang et al. is not prior art under 35 U.S.C. § 102(a), and hence this 
rejection should be withdrawn. 

ISSUE 7: Claims 119-123 and 132-133 are not anticipated under 35 U.S.C. §102(a) 
by Jacobs et al., Genbank Accession No. AAY28806 (October 1999). 

Claims 119-123 and 130-133 stand rejected under 35 U.S.C. § 102(a) allegedly as being 
anticipated by Jacobs et al. (Genbank Accession No. AAY28806; pub: October 7, 1999) (Page 9 
of the Final Office Action mailed April 7, 2008). 

Claims 119-123 have been canceled in the Amendment and Response filed on September 
29, 2006, hence this rejection is moot for these claims. Further, as discussed above, Appellants 
are at least entitled to an effective filing date of June 23, 1999 based on the results of the 'gene 
amplification' assay for the currently pending claims, and this date precedes the publication date 
for Jacobs et al. Therefore, Jacobs et al. is not prior art under 35 U.S.C. § 102(a) and hence this 
rejection should be withdrawn. 

ISSUE 8: Claims 130-133 are not anticipated under 35 U.S.C. §102(a) by Jacobs et 
al., WO 99/50405 (October 1999). 

Claims 130-133 stand rejected under 35 U.S.C. §102(a) allegedly as being anticipated by 
Jacobs et al. (WO 99/50405, pub date 10/7/99). Further, the Examiner states that "The reference 
is silent with respect to whether or not the nucleic acid encodes a protein with chondrocyte 
redifferentiation activity." (Pages 9-10 of the Final Office Action mailed April 7, 2008). 

Applicants submit that they rely on the 'gene amplification' assay (Example 170), not the 
chondrocyte redifferentiation assay, for patentable utility of the instantly claimed subject matter. 
For the reasons discussed above, Applicants are at least entitled to an effective filing date of 
June 23, 1999 based on the results of the 'gene amplification' assay for the currently pending 
claims, and this date precedes the publication date for Jacobs et al. Therefore, Jacobs et al. is 
not prior art under 35 U.S.C. § 102(a) and hence this rejection should be withdrawn. 

ISSUE 9: Claims 124, 127 and 130-133 are not anticipated under 35 U.S.C. §102(e) 
by Shimkets et al., US Patent No. 6,689,866 (March 2000). 

Claims 124, 127 and 130-133 stand rejected under 35 U.S.C. §102(e) allegedly as being 
anticipated by Shimkets et al (U.S. Patent No. 6,689,866 dated 3/8/00). (Page 10 of the Final 
Office Action mailed April 7, 2008). 

For the reasons discussed above, Applicants are at least entitled to an effective filing date 
of June 23, 1999 based on the results of the 'gene amplification' assay for the currently pending 
claims. Shimkets et al is dated after the effective filing date of June 23, 1999. Therefore, 
Shimkets et al is not prior art and these rejections should be withdrawn. 

ISSUE 10: Claims 130 and 132-133 are not made obvious under 35 U.S.C. §103(a) 
by any one of Loci AI769814, AI435407, AI470931, or T15752 in view of Sibson et al., WO 
94/01548 (January 1994). 

Claims 130 and 132-133 stand rejected under 35 U.S.C. § 103(a) allegedly as being 
obvious over any one of loci AI769814, AI435407, AI470931 or T15752 in view of Sibson et al. 
(Pages 11-13 of the Final Office Action mailed April 7, 2008). 

The Examiner has relied on the sequence comparison analysis summarized in the table on 
Page 8 of the Final Office Action mailed April 7, 2008 to assert AI769814, AI435407, AI470931 
or T15752 as prior art. For the reasons discussed below, Appellants believe that AI769814, 
AI435407, AI470931 or T15752 is not prior art. 

Appellants respectfully remind the Board that the instant case is directed to polypeptides, 
particularly, to the polypeptide of SEQ ID NO:229, and not to nucleic acids. Appellants note 
that the polypeptide sequences encoded by AI769814, AI435407, AI470931 or T15752 were not 
reduced to practice in the cited art nor did the art provide any disclosure whatsoever of the 
full-length polypeptide encoded by any of these nucleic acid fragments. Hence, this rejection for the 
instant polypeptide case based on nucleic acid ESTs alone is not appropriate and therefore, 
AI769814, AI435407, AI470931 or T15752 are not prior art. 

Appellants would like to point out that locus AI769814 has a publication date of 
December 21, 1999. For the reasons discussed above, Appellants are at least entitled to an 
effective filing date of June 23, 1999 based on the results of the 'gene amplification' assay for 
the currently pending claims. Locus AI769814 is dated after the effective filing date of 
June 23, 1999. Therefore, Locus AI769814 is not prior art and these rejections should be 
withdrawn. 

The Examiner alleges that: 1) Locus AI769814 has 100% identity to bases 1703-2180 of 
SEQ ID NO:228; 2) Locus AI435407 has 99.8% identity to bases 1743-2185 of SEQ ID 
NO:228; 3) Locus AI470931 has 100% identity to bases 1795-2179 of SEQ ID NO:228; 4) 
Locus AI769814 has 100% identity to bases 1703-2180 of SEQ ID NO:228; and 5) Locus 
T15752 has 100% identity to bases 1870-2184 of SEQ ID NO:228. (Page 8 of the Final Office 
Action mailed April 7, 2008). The Examiner has further stated that "sequence identity is 
calculated relative to the shorter of the two sequences being compared." (Page 11 of the Final 
Office Action mailed April 7, 2008). 

Appellants strongly disagree with the Examiner's calculation of sequence similarity 
because sequence similarity should be calculated by following the definition(s) provided in the 
specification for comparison of sequences, not the Examiner's definition. Indeed, the 
specification describes methods for the determination of percent identity between two nucleic 
acid sequences. (See page 309 to page 310, copy enclosed). In fact, the specification teaches 
specific parameters to be associated with the term "percent identity" as applied to the present 
invention. 

The examples shown in Table 5 (Page 337 of the Specification) are scenarios wherein the 
nucleic acid sequence being compared is shorter than the full length of the PRO-DNA nucleic 
acid sequence. 

Appellants have made of record the appropriate sequence alignment comparing instantly 
claimed Sequence 1 (the instant application's SEQ ID NO:228 sequence) with Sequence 2 
(AI769814, AI435407, AI470931 or T15752). (See Response filed January 18, 2008). 

AI769814 

Appellants' analysis of the sequence alignment comparing the instantly claimed 
Sequence 1 (the instant application's SEQ ID NO:228 sequence) with Sequence 2 (AI769814) 
results in 478 identical nucleotides out of the total 2185 nucleotides. (See Alignment I filed with 
Response of January 18, 2008). 

AI435407 

Appellants' analysis of the sequence alignment comparing the instantly claimed 
Sequence 1 (the instant application's SEQ ID NO:228 sequence) with Sequence 2 (AI435407) 
results in 441 identical nucleotides out of the total 2185 nucleotides. (See Alignment II filed 
with Response of January 18, 2008). 

AI470931 

Appellants' analysis of the sequence alignment comparing the instantly claimed 
Sequence 1 (the instant application's SEQ ID NO:228 sequence) with Sequence 2 (AI470931) 
results in 385 identical nucleotides out of the total 2185 nucleotides. (See Alignment III filed 
with Response of January 18, 2008). 

T15752 

Appellants' analysis of the sequence alignment comparing the instantly claimed 
Sequence 1 (the instant application's SEQ ID NO:228 sequence) with Sequence 2 (T15752) 
results in 359 identical nucleotides out of the total 2185 nucleotides. (See Alignment IV filed 
with Response of January 18, 2008). 

Based on the teachings and the clearly disclosed definition of the specification, 
Appellants respectfully submit that the correct percent identity comparing the AI769814 locus 
with the instant application's SEQ ID NO:228 should be calculated as follows: 

On Appeal to the Board of Patent Appeals and Interferences 

Appellants' Brief 
Application Serial No. 09/991,163 
Attorney's Docket No. GNE-2730 P1C17 



(number of matching nucleotides between the two nucleic acid sequences) divided 
by (the total number of nucleotides of the PRO-DNA nucleic acid sequence) = 
(478 divided by 2185) times 100 = 21.87% (see alignment enclosed) 

Based on the above calculation, the correct percent identity comparing the loci 
AI435407, AI470931 and T15752 with the instant application's SEQ ID NO:228 
results in 20.18%, 17.62% and 16.15%, respectively (see alignments enclosed). 

That is, when calculating the nucleic acid sequence identity with any portion of 
AI769814, AI435407, AI470931 and T15752, the full length of SEQ ID NO:228 must be used in 
the denominator, which results in only 21.87%, 20.18%, 17.62% and 16.15% identities. 
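
The difference between the two denominators can be illustrated with a short calculation. The following is a minimal sketch in Python and is not part of the record: the function name `percent_identity` is hypothetical, and the assumption that the aligned span of AI769814 is exactly bases 1703-2180 is taken from the Examiner's statement quoted above. It uses only the match counts and the 2185-nucleotide length stated in the text for three of the loci. The last decimal place can differ from the figures quoted above depending on whether a value is rounded or truncated.

```python
def percent_identity(matches: int, denominator_length: int) -> float:
    """Return matches / denominator_length expressed as a percentage."""
    if denominator_length <= 0:
        raise ValueError("denominator_length must be positive")
    if not 0 <= matches <= denominator_length:
        raise ValueError("matches must lie between 0 and denominator_length")
    return 100.0 * matches / denominator_length


FULL_LENGTH = 2185  # total nucleotides of SEQ ID NO:228, as stated in the text

# Matching-nucleotide counts as stated in the text for three of the loci.
matches_by_locus = {"AI769814": 478, "AI435407": 441, "AI470931": 385}

# Denominator = full length of SEQ ID NO:228 (the definition Appellants rely on).
full_length_identity = {
    locus: percent_identity(n, FULL_LENGTH)
    for locus, n in matches_by_locus.items()
}

# Denominator = the shorter sequence (the Examiner's approach).
# Assumption: the aligned span of AI769814 is bases 1703-2180 inclusive.
shorter_span = 2180 - 1703 + 1
shorter_identity_ai769814 = percent_identity(478, shorter_span)

for locus, pct in full_length_identity.items():
    print(f"{locus}: {pct:.2f}% of the full-length sequence")
print(f"AI769814 relative to its own span: {shorter_identity_ai769814:.1f}%")
```

The same 478 matching nucleotides therefore yield either 100% or roughly 22%, depending solely on which length is placed in the denominator.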

Appellants have submitted the above alignments and sequence identity analysis for the 
Examiner's review even though the instant case is directed to polypeptides, particularly, to the 
polypeptide of SEQ ID NO:229, and not to nucleic acids. 

The loci AI769814, AI435407, AI470931 and T15752 disclose a sequence that is 
21.87%, 20.18%, 17.62% and 16.15% identical to SEQ ID NO:228. 

As discussed above, loci AI769814, AI435407, AI470931 and T15752 do not disclose 
each and every limitation of Claims 130 and 132-133. Further, Sibson et al. does not cure the 
deficiencies of loci AI769814, AI435407, AI470931 and T15752. Since the primary reference 
falls as a prior art reference, Appellants respectfully submit that the instant claims are not 
obvious over loci AI769814, AI435407, AI470931 and T15752 in view of Sibson et al. 

ISSUE 11: Claim 131 is not made obvious under 35 U.S.C. §103(a) by any one of 
Loci AI769814, AI435407, AI470931, or T15752 in view of Sibson et al. and further in view 
of Capon et al., US Patent No. 5,116,964 (May 1992). 

Claim 131 stands rejected under 35 U.S.C. § 103(a) allegedly as being obvious over any 
one of loci AI769814, AI435407, AI470931 or T15752 in view of Sibson et al. and further in view 
of U.S. Patent No. 5,116,964 (Capon). (Page 13 of the Final Office Action mailed April 7, 
2008). 

As discussed above, loci AI769814, AI435407, AI470931 and T15752 do not disclose 
each and every limitation of Claim 131. Further, Sibson et al. and Capon et al. do not cure the 
deficiencies of loci AI769814, AI435407, AI470931 and T15752. Since the primary reference 
falls as a prior art reference, Appellants respectfully submit that the instant claims are not 
obvious over loci AI769814, AI435407, AI470931 and T15752 in view of Sibson et al. and 
Capon et al. 

ISSUE 12: Claims 130 and 131 are not made obvious under 35 U.S.C. §103(a) by 
Wang et al., Genbank Accession No. AF196976 (October 1999), in view of Capon et al., US 
Patent No. 5,116,964 (May 1992). 

Claims 130 and 131 stand rejected under 35 U.S.C. § 103(a) allegedly as being obvious 
over Wang et al., Genbank Accession No. AF196976 in view of Sibson et al., and Capon et al., 
U.S. Patent No. 5,116,964. (Pages 13-14 of the Final Office Action mailed April 7, 2008). 

For the reasons discussed above, Appellants are at least entitled to an effective filing date 
of June 23, 1999 based on the results of the 'gene amplification' assay for the currently pending 
claims, and this date precedes the publication date for Wang et al. Therefore, Wang et al. is not 
prior art. Since the primary reference falls as a prior art reference, Appellants respectfully 
submit that the instant claims are not obvious over Wang et al., in view of Sibson et al., and 
Capon et al. 

Accordingly, withdrawal of the rejection of Claims 130, 131 and 132-133 under 
35 U.S.C. § 103(a) is respectfully requested. 






CONCLUSION 



For the reasons given above, Appellants submit that the present specification and the 
specification of U.S. Provisional Patent Application Serial No. 60/141,037, filed June 23, 1999, 
clearly describe and provide at least one patentable utility for the instantly claimed invention. 
Moreover, it is respectfully submitted that the present specification clearly teaches "how to use" 
the presently claimed polypeptide based upon this disclosed patentable utility. Accordingly, the 
primary references cited by the Examiner are not prior art. As such, Appellants respectfully 
request reconsideration and reversal of the outstanding rejection of Claims 124-126 and 129-133. 

The Commissioner is authorized to charge any fees which may be required, including 
extension fees, or credit any overpayment to Deposit Account No. 50-4634 (referencing 
Attorney's Docket No. 123851-181895 (GNE-2730 P1C17)). 



GOODWIN PROCTER LLP 

135 Commonwealth Drive 
Menlo Park, California 94025 
Telephone: (650)752-3100 
Facsimile: (650) 853-1038 



Respectfully submitted, 



Date: January 5, 2009 

VIII. CLAIMS APPENDIX 

Claims on Appeal 

124. An isolated polypeptide comprising: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:229; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:229, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203110. 

125. The isolated polypeptide of Claim 124 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO:229. 

126. The isolated polypeptide of Claim 124 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO:229, lacking its associated signal peptide. 

129. The isolated polypeptide of Claim 124 comprising the amino acid sequence of the 
polypeptide encoded by the full-length coding sequence of the cDNA deposited under ATCC 
accession number 203110. 

130. A chimeric polypeptide comprising a polypeptide according to Claim 124 fused to 
a heterologous polypeptide. 

131. The chimeric polypeptide of Claim 130, wherein said heterologous polypeptide is 
an epitope tag or an Fc region of an immunoglobulin. 

132. An isolated polypeptide comprising an amino acid sequence having at least 95% 
amino acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:229; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:229, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203110. 

133. An isolated polypeptide comprising an amino acid sequence having at least 99% 
amino acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:229; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:229, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203110. 

IX. EVIDENCE APPENDIX 

1. Declaration of Audrey Goddard, Ph.D. under 37 C.F.R. §1.132, with attached Exhibits A-G: 

A. Curriculum Vitae of Audrey D. Goddard, Ph.D. 

B. Higuchi, R. et al., "Simultaneous amplification and detection of specific 
DNA sequences," Biotechnology 10:413-417 (1992). 

C. Livak, K.J., et al., "Oligonucleotides with fluorescent dyes at opposite 
ends provide a quenched probe system useful for detecting PCR product 
and nucleic acid hybridization," PCR Methods Appl. 4:357-362 (1995). 

D. Heid, C.A. et al., "Real time quantitative PCR," Genome Res. 6:986-994 
(1996). 

E. Pennica, D. et al., "WISP genes are members of the connective tissue 
growth factor family that are up-regulated in Wnt-1-transformed cells and 
aberrantly expressed in human colon tumors," Proc. Natl. Acad. Sci. USA 
95:14717-14722 (1998). 

F. Pitti, R.M. et al., "Genomic amplification of a decoy receptor for Fas 
ligand in lung and colon cancer," Nature 396:699-703 (1998). 

G. Bieche, I. et al., "Novel approach to quantitative polymerase chain 
reaction using real-time detection: Application to the detection of gene 
amplification in breast cancer," Int. J. Cancer 78:661-666 (1998). 

2. Declaration of Avi Ashkenazi, Ph.D. under 37 C.F.R. §1.132, with attached Exhibit A 
(Curriculum Vitae). 

3. Declaration of Paul Polakis, Ph.D. under 37 C.F.R. §1.132. 

4. Orntoft, T.F., et al., Molecular & Cellular Proteomics 1:37-45 (2002). 

5. Hyman, E., et al., "Impact of DNA Amplification on Gene Expression Patterns in Breast 
Cancer," Cancer Research 62:6240-6245 (2002). 

6. Pollack, J.R., et al., "Microarray Analysis Reveals a Major Direct Role of DNA Copy 
Number Alteration in the Transcriptional Program of Human Breast Tumors," Proc. Natl. 
Acad. Sci. USA 99:12963-12968 (2002). 

7. Hanna et al., "HER-2/neu Breast Cancer Predictive Testing," Pathology Associates 
Medical Laboratories (1999). 

8. Pennica, D. et al., "WISP genes are members of the connective tissue growth factor 
family that are up-regulated in Wnt-1-transformed cells and aberrantly expressed in 
human colon tumors," Proc. Natl. Acad. Sci. USA 83: 4049-52 (1986). 

9. Konopka et al., "Variable Expression of the Translocated c-abl oncogene in 
Philadelphia-chromosome-positive B-lymphoid cell lines from chronic myelogenous leukemia 
patients," Proc. Natl. Acad. Sci. USA 83: 4049-52 (1986). 

10. Haynes, P. A., et al., "Proteome analysis: Biological assay or data archive?" Electrophoresis 
19:1862-1871 (1996). 

11. Sen, 2000, Curr. Opin. Oncol. 12:82-88. 

12. Hu, Y. et al., "Analysis of genomic and proteomic data using advanced literature mining," 
Journal of Proteome Research 2:405-412 (2003). 

13. Declaration of Paul Polakis, Ph.D. under 37 C.F.R. §1.132 (Polakis II). 

14. Godbout, R., et al., J. Biol. Chem. 273(33):21161-8 (1998). 

15. Li et al., 2006, Oncogene 25: 2628-2635. 

Item 1 was submitted with Appellants' Response filed November 9, 2005, and was considered by 
the Examiner as indicated in the Office Action mailed March 31, 2006. 

Items 2-7 were submitted with Appellants' Response filed November 8, 2004, and were 
considered by the Examiner as indicated in the Office Action mailed May 10, 2005. 

Items 7-11 were cited by the Examiner in the Office Action mailed July 6, 2004. 

Item 12 was cited by the Examiner in the Office Action mailed May 10, 2005. 

Items 13-14 were submitted with Appellants' Response filed September 29, 2006, and were 
considered by the Examiner as indicated in the Office Action mailed December 26, 2006. 

Item 15 was cited by the Examiner in the Office Action mailed December 26, 2006. 






X. RELATED PROCEEDINGS APPENDIX 

None- no decision rendered by a Court or the Board in any related proceedings identified 

above. 



LIBC/3432297.1 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re Application of: Ashkenazi et al. 
Serial No.: 09/903,925 
Filed: July 11, 2001 
For: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC ACIDS 

Group Art Unit: 1647 
Examiner: Fozia Hamid 

DECLARATION OF AUDREY D. GODDARD, Ph.D. UNDER 37 C.F.R. § 1.132 

Assistant Commissioner of Patents 
Washington, D.C. 20231 



Sir: 

I, Audrey D. Goddard, Ph.D., do hereby declare and say as follows: 

1. I am a Senior Clinical Scientist at the Experimental Medicine/BioOncology, Medical 
Affairs Department of Genentech, Inc., South San Francisco, California 94080. 

2. Between 1993 and 2001, I headed the DNA Sequencing Laboratory at the Molecular 
Biology Department of Genentech, Inc. During this time, my responsibilities included the 
identification and characterization of genes contributing to the oncogenic process, and determination 
of the chromosomal localization of novel genes. 

3. My scientific Curriculum Vitae, including my list of publications, is attached to and 
forms part of this Declaration (Exhibit A). 





4. I am familiar with a variety of techniques known in the art for detecting and 
quantifying the amplification of oncogenes in cancer, including the quantitative TaqMan PCR (i.e., 
"gene amplification") assay described in the above captioned patent application. 

5. The TaqMan PCR assay is described, for example, in the following scientific 
publications: Higuchi et al., Biotechnology 10:413-417 (1992) (Exhibit B); Livak et al., PCR 
Methods Appl. 4:357-362 (1995) (Exhibit C) and Heid et al., Genome Res. 6:986-994 (1996) 
(Exhibit D). Briefly, the assay is based on the principle that successful PCR yields a fluorescent 
signal due to Taq DNA polymerase-mediated exonuclease digestion of a fluorescently labeled 
oligonucleotide that is homologous to a sequence between two PCR primers. The extent of 
digestion depends directly on the amount of PCR, and can be quantified accurately by measuring the 
increment in fluorescence that results from decreased energy transfer. This is an extremely sensitive 
technique, which allows detection in the exponential phase of the PCR reaction and, as a result, 
leads to accurate determination of gene copy number. 

6. The quantitative fluorescent TaqMan PCR assay has been extensively and 
successfully used to characterize genes involved in cancer development and progression. 
Amplification of protooncogenes has been studied in a variety of human tumors, and is widely 
considered as having etiological, diagnostic and prognostic significance. This use of the quantitative 
TaqMan PCR assay is exemplified by the following scientific publications: Pennica et al., Proc. 
Natl. Acad. Sci. USA 95(25):14717-14722 (1998) (Exhibit E); Pitti et al., Nature 
396(6712):699-703 (1998) (Exhibit F) and Bieche et al., Int. J. Cancer 78:661-666 (1998) (Exhibit 
G), the first two of which I am co-author. In particular, Pennica et al. have used the quantitative 
TaqMan PCR assay to study relative gene amplification of WISP and c-myc in various cell lines, 
colorectal tumors and normal mucosa. Pitti et al. studied the genomic amplification of a decoy 
receptor for Fas ligand in lung and colon cancer, using the quantitative TaqMan PCR assay. Bieche 
et al. used the assay to study gene amplification in breast cancer. 

7. It is my personal experience that the quantitative TaqMan PCR technique is 
technically sensitive enough to detect at least a 2-fold increase in gene copy number relative to 
control. It is further my considered scientific opinion that an at least 2-fold increase in gene copy 
number in a tumor tissue sample relative to a normal (i.e., non-tumor) sample is significant and 
useful in that the detected increase in gene copy number in the tumor sample relative to the normal 
sample serves as a basis for using relative gene copy number as quantitated by the TaqMan PCR 
technique as a diagnostic marker for the presence or absence of tumor in a tissue sample of unknown 
pathology. Accordingly, a gene identified as being amplified at least 2-fold by the quantitative 
TaqMan PCR assay in a tumor sample relative to a normal sample is useful as a marker for the 
diagnosis of cancer, for monitoring cancer development and/or for measuring the efficacy of cancer 
therapy. 
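
The at least 2-fold criterion described in paragraph 7 amounts to a simple ratio test. The sketch below is illustrative only and forms no part of the declaration: the function names, the sample values, and the representation of gene copy number as a single measured quantity per sample are all assumptions made for the example.

```python
AMPLIFICATION_THRESHOLD = 2.0  # "at least 2-fold", per the criterion stated above


def fold_change(tumor_copy_number: float, normal_copy_number: float) -> float:
    """Ratio of gene copy number in a tumor sample to that in a normal sample."""
    if tumor_copy_number < 0:
        raise ValueError("tumor_copy_number must not be negative")
    if normal_copy_number <= 0:
        raise ValueError("normal_copy_number must be positive")
    return tumor_copy_number / normal_copy_number


def is_amplified(
    tumor_copy_number: float,
    normal_copy_number: float,
    threshold: float = AMPLIFICATION_THRESHOLD,
) -> bool:
    """True when the tumor sample shows at least `threshold`-fold amplification."""
    return fold_change(tumor_copy_number, normal_copy_number) >= threshold


# Hypothetical measurements, chosen only to exercise both outcomes.
print(is_amplified(4.0, 2.0))  # a 2-fold increase meets the criterion
print(is_amplified(3.0, 2.0))  # a 1.5-fold increase does not
```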

8. I declare further that all statements made herein of my own knowledge are true and 
that all statements made on information and belief are believed to be true. I declare that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code, and that such willful false statements may jeopardize the validity of the application or any 
patent issuing thereon. 

Date Audrey D. Goddard, Ph.D. 

Genentech, Inc. 
1 DNA Way 
South San Francisco, CA, 94080 
650.225.6429 
goddarda@gene.com 

110 Congo St. 
San Francisco, CA, 94131 
415.841.9154 
415.819.2247 (mobile) 
agoddard@pacbell.net 



PROFESSIONAL EXPERIENCE 

Genentech, Inc. 1993-present 
South San Francisco, CA 

2001 - present Senior Clinical Scientist 

Experimental Medicine / BioOncology, Medical Affairs 

Responsibilities: 

• Companion diagnostic oncology products 

• Acquisition of clinical samples from Genentech's clinical trials for translational research 

• Translational research using clinical specimen and data for drug development and 
diagnostics 

• Member of Development Science Review Committee, Diagnostic Oversight Team, 21 CFR 
Part 11 Subteam 

Interests: 

• Ethical and legal implications of experiments with clinical specimens and data 

• Application of pharmacogenomics in clinical trials 



1998 - 2001 Senior Scientist 

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities: 

• Management of a laboratory of up to nineteen, including postdoctoral fellow, associate 
scientist, senior research associate and research assistant/associate levels 

• Management of a $750K budget 

• DNA sequencing core facility supporting a 350+ person research facility. 

• DNA sequencing for high throughput gene discovery: ESTs, cDNAs, and constructs 

• Genomic sequence analysis and gene identification 

• DNA sequence and primary protein analysis 

Research: 

• Chromosomal localization of novel genes 

• Identification and characterization of genes contributing to the oncogenic process 

• Identification and characterization of genes contributing to inflammatory diseases 

• Design and development of schemes for high throughput genomic DNA sequence analysis 

• Candidate gene prediction and evaluation 

1993 - 1998 Scientist 

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities: 

• DNA sequencing core facility supporting a 350+ person research facility 

• Assumed responsibility for a pre-existing team of five technicians and expanded the group 
into fifteen, introducing a level of middle management and additional areas of research 

• Participated in the development of the basic plan for high throughput secreted protein 
discovery program - sequencing strategies, data analysis and tracking, database design 

• High throughput EST and cDNA sequencing for new gene identification. 

• Design and implementation of analysis tools required for high throughput gene identification. 

• Chromosomal localization of genes encoding novel secreted proteins. 

Research: 

• Genomic sequence scanning for new gene discovery. 

• Development of signal peptide selection methods. 

• Evaluation of candidate disease genes. 

• Growth hormone receptor gene SNPs in children with Idiopathic short stature 

Imperial Cancer Research Fund 1989-1992 
London, UK with Dr. Ellen Solomon 

6/89 -12/92 Postdoctoral Fellow 

• Cloning and characterization of the genes fused at the acute promyelocytic leukemia 
translocation breakpoints on chromosomes 17 and 15. 

• Prepared a successfully funded European Union multi-center grant application 

McMaster University 1983 
Hamilton, Ontario, Canada with Dr. G. D. Sweeney 

5/83 - 8/83: NSERC Summer Student 

• In vitro metabolism of β-naphthoflavone in C57BL/6J and DBA mice



EDUCATION 



Ph.D. 1989
University of Toronto, Toronto, Ontario, Canada. Department of Medical Biophysics.
"Phenotypic and genotypic effects of mutations in the human retinoblastoma gene."
Supervisor: Dr. R. A. Phillips

Honours B.Sc. 1983
McMaster University, Hamilton, Ontario, Canada. Department of Biochemistry.
"The in vitro metabolism of the cytochrome P-448 inducer β-naphthoflavone in C57BL/6J mice."
Supervisor: Dr. G. D. Sweeney


ACADEMIC AWARDS

Imperial Cancer Research Fund Postdoctoral Fellowship          1989-1992
Medical Research Council Studentship                            1983-1988
NSERC Undergraduate Summer Research Award                       1983
Society of Chemical Industry Merit Award (Hons. Biochem.)       1983
Dr. Harry Lyman Hooker Scholarship                              1981-1983
J.L.W. Gill Scholarship                                         1981-1982
Business and Professional Women's Club Scholarship              1980-1981
Weyerhaeuser Foundation Scholarship                             1979-1980



INVITED PRESENTATIONS 

Genentech's gene discovery pipeline: High throughput identification, cloning and
characterization of novel genes. Functional Genomics: From Genome to Function, Litchfield
Park, AZ, USA. October 2000

High throughput identification, cloning and characterization of novel genes. G2K: Back to
Science, Advances in Genome Biology and Technology I. Marco Island, FL, USA. February
2000

Quality control in DNA Sequencing: The use of Phred and Phrap. Bay Area Sequencing
Users Meeting, Berkeley, CA, USA. April 1999

High throughput secreted protein identification and cloning. Tenth International Genome
Sequencing and Analysis Conference, Miami, FL, USA. September 1998

The evolution of DNA sequencing: The Genentech perspective. Bay Area Sequencing Users
Meeting, Berkeley, CA, USA. May 1998

Partial Growth Hormone Insensitivity: The role of GH-receptor mutations in Idiopathic Short
Stature. Tenth Annual National Cooperative Growth Study Investigators Meeting, San
Francisco, CA, USA. October 1996

Growth hormone (GH) receptor defects are present in selected children with non-GH-deficient
short stature: A molecular basis for partial GH-insensitivity. 76th Annual Meeting of The
Endocrine Society, Anaheim, CA, USA. June 1994

A previously uncharacterized gene, myl, is fused to the retinoic acid receptor alpha gene in
acute promyelocytic leukemia. XV International Association for Comparative Research on
Leukemia and Related Disease, Padua, Italy. October 1991


PATENTS

Goddard A, Godowski PJ, Gurney AL. NL2 Tie ligand homologue polypeptide. Patent
Number: 6,455,496. Date of Patent: Sept. 24, 2002.

Goddard A, Godowski PJ and Gurney AL. NL3 Tie ligand homologue nucleic acids. Patent
Number: 6,426,218. Date of Patent: July 30, 2002.

Godowski P, Gurney A, Hillan KJ, Botstein D, Goddard A, Roy M, Ferrara N, Tumas D,
Schwall R. NL4 Tie ligand homologue nucleic acid. Patent Number: 6,413,770. Date of
Patent: July 2, 2002.

Ashkenazi A, Fong S, Goddard A, Gurney AL, Napier MA, Tumas D, Wood WI. Nucleic acid
encoding A-33 related antigen polypeptides. Patent Number: 6,410,708. Date of Patent:
Jun. 25, 2002.

Botstein DA, Cohen RL, Goddard AD, Gurney AL, Hillan KJ, Lawrence DA, Levine AJ,
Pennica D, Roy MA and Wood WI. WISP polypeptides and nucleic acids encoding same.
Patent Number: 6,387,657. Date of Patent: May 14, 2002.

Goddard A, Godowski PJ and Gurney AL. Tie ligands. Patent Number: 6,372,491. Date of
Patent: April 16, 2002.

Godowski PJ, Gurney AL, Goddard A and Hillan K. TIE ligand homologue antibody. Patent
Number: 6,350,450. Date of Patent: Feb. 26, 2002.

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Tie
receptor tyrosine kinase ligand homologues. Patent Number: 6,348,351. Date of Patent:
Feb. 19, 2002.

Goddard A, Godowski PJ and Gurney AL. Ligand homologues. Patent Number: 6,348,350.
Date of Patent: Feb. 19, 2002.

Attie KM, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth
hormone insensitivity syndrome. Patent Number: 6,207,640. Date of Patent: March 27,
2001.

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Nucleic
acids encoding NL-3. Patent Number: 6,074,873. Date of Patent: June 13, 2000.

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,824,642. Date of Patent: October 20, 1998.

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,646,113. Date of Patent: July 8, 1997.

Multiple additional provisional applications filed 

PUBLICATIONS

Seshasayee D, Dowd P, Gu Q, Erickson S, Goddard AD. Comparative sequence analysis of
the HER2 locus in mouse and man. Manuscript in preparation.

Abuzzahab MJ, Goddard A, Grigorescu F, Lautier C, Smith RJ and Chernausek SD. Human 
IGF-1 receptor mutations resulting in pre- and post-natal growth retardation. Manuscript in 
preparation. 

Aggarwal S, Xie M-H, Foster J, Frantz G, Stinson J, Corpuz RT, Simmons L, Milan K,
Yansura DG, Vandlen RL, Goddard AD and Gurney AL. FHFR, a novel receptor for the
fibroblast growth factors. Manuscript submitted.

Adams SH, Chui C, Schilbach SL, Yu XX, Goddard AD, Grimaldi JC, Lee J, Dowd P, Colman
S., Lewin DA. (2001) BFIT, a unique acyl-CoA thioesterase induced in thermogenic brown 
adipose tissue: Cloning, organization of the human gene, and assessment of a potential link 
to obesity. Biochemical Journal 360: 135-142. 

Lee J, Ho WH, Maruoka M, Corpuz RT, Baldwin DT, Foster JS, Goddard AD, Yansura DG,
Vandlen RL, Wood WI, Gurney AL. (2001) IL-17E, a novel proinflammatory ligand for the
IL-17 receptor homolog IL-17Rh1. Journal of Biological Chemistry 276(2): 1660-1664.

Xie M-H, Aggarwal S, Ho W-H, Foster J, Zhang Z, Stinson J, Wood WI, Goddard AD and
Gurney AL. (2000) Interleukin (IL)-22, a novel human cytokine that signals through the 
interferon-receptor related proteins CRF2-4 and IL-22R. Journal of Biological Chemistry 275: 
31335-31339. 

Weiss GA, Watanabe CK, Zhong A, Goddard A and Sidhu SS. (2000) Rapid mapping of 
protein functional epitopes by combinatorial alanine scanning. Proc. Natl. Acad. Sci. USA 97:
8950-8954. 

Guo S, Yamaguchi Y, Schilbach S, Wada T, Lee J, Goddard A, French D, Handa H,
Rosenthal A. (2000) A regulator of transcriptional elongation controls vertebrate neuronal
development. Nature 408: 366-369.

Yan M, Wang L-C, Hymowitz SG, Schilbach S, Lee J, Goddard A, de Vos AM, Gao WQ, Dixit
VM. (2000) Two-amino acid molecular switch in an epithelial morphogen that regulates 
binding to two distinct receptors. Science 290: 523-527. 

Sehl PD, Tai JTN, Hillan KJ, Brown LA, Goddard A, Yang R, Jin H and Lowe DG. (2000) 
Application of cDNA microarrays in determining molecular phenotype in cardiac growth, 
development, and response to injury. Circulation 101: 1990-1999. 

Guo S, Brush J, Teraoka H, Goddard A, Wilson SW, Mullins MC and Rosenthal A. (1999) 
Development of noradrenergic neurons in the zebrafish hindbrain requires BMP, FGF8, and 
the homeodomain protein soulless/Phox2A. Neuron 24: 555-566. 

Stone D, Murone M, Luoh S-M, Ye W, Armanini P, Gurney A, Phillips HS, Brush J, Goddard
A, de Sauvage FJ and Rosenthal A. (1999) Characterization of the human suppressor of
fused; a negative regulator of the zinc-finger transcription factor Gli. J. Cell Sci. 112:
4437-4448.

Xie M-H, Holcomb I, Deuel B, Dowd P, Huang A, Vagts A, Foster J, Liang J, Brush J, Gu Q,
Hillan K, Goddard A and Gurney AL. (1999) FGF-19, a novel fibroblast growth factor with
unique specificity for FGFR4. Cytokine 11: 729-735.

Yan M, Lee J, Schilbach S, Goddard A and Dixit V. (1999) mE10, a novel caspase
recruitment domain-containing proapoptotic molecule. J. Biol. Chem. 274(15): 10287-10292. 

Gurney AL, Marsters SA, Huang RM, Pitti RM, Mark DT, Baldwin DT, Gray AM, Dowd P, 
Brush J, Heldens S, Schow P, Goddard AD, Wood WI, Baker KP, Godowski PJ and
Ashkenazi A. (1999) Identification of a new member of the tumor necrosis factor family and its
receptor, a human ortholog of mouse GITR. Current Biology 9(4): 215-218.

Ridgway JBB, Ng E, Kern JA, Lee J, Brush J, Goddard A and Carter P. (1999) Identification
of a human anti-CD55 single-chain Fv by subtractive panning of a phage library using tumor 
and nontumor cell lines. Cancer Research 59: 2718-2723. 

Pitti RM, Marsters SA, Lawrence DA, Roy M, Kischkel FC, Dowd P, Huang A, Donahue CJ, 
Sherwood SW, Baldwin DT, Godowski PJ, Wood WI, Gurney AL, Hillan KJ, Cohen RL,
Goddard AD, Botstein D and Ashkenazi A (1998) Genomic amplification of a decoy receptor 
for Fas ligand in lung and colon cancer. Nature 396(6712): 699-703. 

Pennica D, Swanson TA, Welsh JW, Roy MA, Lawrence DA, Lee J, Brush J, Taneyhill LA, 
Deuel B, Lew M, Watanabe C, Cohen RL, Melhem MF, Finley GG, Quirke P, Goddard AD, 
Hillan KJ, Gurney AL, Botstein D and Levine AJ. (1998) WISP genes are members of the 
connective tissue growth factor family that are up-regulated in wnt-1-transformed cells and
aberrantly expressed in human colon tumors. Proc. Natl. Acad. Sci. USA 95(25):
14717-14722.

Yang RB, Mark MR, Gray A, Huang A, Xie MH, Zhang M, Goddard A, Wood WI, Gurney AL
and Godowski PJ. (1998) Toll-like receptor-2 mediates lipopolysaccharide-induced cellular 
signalling. Nature 395(6699): 284-288. 

Merchant AM, Zhu Z, Yuan JQ, Goddard A, Adams CW, Presta LG and Carter P. (1998) An 
efficient route to human bispecific IgG. Nature Biotechnology 16(7): 677-681.

Marsters SA, Sheridan JP, Pitti RM, Brush J, Goddard A and Ashkenazi A. (1998)
Identification of a ligand for the death-domain-containing receptor Apo3. Current Biology
8(9): 525-528.

Xie J, Murone M, Luoh SM, Ryan A, Gu Q, Zhang C, Bonifas JM, Lam CW, Hynes M, 
Goddard A, Rosenthal A, Epstein EH Jr. and de Sauvage FJ. (1998) Activating Smoothened 
mutations in sporadic basal-cell carcinoma. Nature. 391(6662): 90-92. 

Marsters SA, Sheridan JP, Pitti RM, Huang A, Skubatch M, Baldwin D, Yuan J, Gurney A, 
Goddard AD, Godowski P and Ashkenazi A. (1997) A novel receptor for Apo2L/TRAIL 
contains a truncated death domain. Current Biology. 7(12): 1003-1006. 

Hynes M, Stone DM, Dowd M, Pitts-Meek S, Goddard A, Gurney A and Rosenthal A (1997) 
Control of cell pattern in the neural tube by the zinc finger transcription factor Gli-1. Neuron 
19: 15-26. 

Sheridan JP, Marsters SA, Pitti RM, Gurney A, Skubatch M, Baldwin D, Ramakrishnan L, 
Gray CL, Baker K, Wood WI, Goddard AD, Godowski P, and Ashkenazi A. (1997) Control of
TRAIL-Induced Apoptosis by a Family of Signaling and Decoy Receptors. Science 277
(5327): 818-821.

Goddard AD, Dowd P, Chernausek S, Geffner M, Gertner J, Hintz R, Hopwood N, Kaplan S,
Plotnick L, Rogol A, Rosenfield R, Saenger P, Mauras N, Hershkopf R, Angulo M and Attie, K. 
(1997) Partial growth hormone insensitivity: The role of growth hormone receptor mutations in 
idiopathic short stature. J. Pediatr. 131: S51-55. 

Klein RD, Sherman D, Ho WH, Stone D, Bennett GL, Moffat B, Vandlen R, Simmons L, Gu Q,
Hongo JA, Devaux B, Poulsen K, Armanini M, Nozaki C, Asai N, Goddard A, Phillips H,
Henderson CE, Takahashi M and Rosenthal A. (1997) A GPI-linked protein that interacts with 
Ret to form a candidate neurturin receptor. Nature. 387(6634): 717-21. 

Stone DM, Hynes M, Armanini M, Swanson TA, Gu Q, Johnson RL, Scott MP, Pennica D, 
Goddard A, Phillips H, Noll M, Hooper JE, de Sauvage F and Rosenthal A. (1996) The 
tumour-suppressor gene patched encodes a candidate receptor for Sonic hedgehog. Nature 
384(6605): 129-34.

Marsters SA, Sheridan JP, Donahue CJ, Pitti RM, Gray CL, Goddard AD, Bauer KD and
Ashkenazi A. (1996) Apo-3, a new member of the tumor necrosis factor receptor family,
contains a death domain and activates apoptosis and NF-kappa B. Current Biology 6(12):
1669-76. 

Rothe M, Xiong J, Shu HB, Williamson K, Goddard A and Goeddel DV. (1996) I-TRAF is a
novel TRAF-interacting protein that regulates TRAF-mediated signal transduction. Proc. Natl.
Acad. Sci. USA 93: 8241-8246.

Yang M, Luoh SM, Goddard A, Reilly D, Henzel W and Bass S. (1996) The bglX gene
located at 47.8 min on the Escherichia coli chromosome encodes a periplasmic
beta-glucosidase. Microbiology 142: 1659-65.

Goddard AD and Black DM. (1996) Familial Cancer in Molecular Endocrinology of Cancer. 
Waxman, J. Ed. Cambridge University Press, Cambridge UK, pp. 187-215.

Treanor JJS, Goodman L, de Sauvage F, Stone DM, Poulson KT, Beck CD, Gray C, Armanini 
MP, Pollocks RA, Hefti F, Phillips HS, Goddard A, Moore MW, Buj-Bello A, Davis AM, Asai N,
Takahashi M, Vandlen R, Henderson CE and Rosenthal A. (1996) Characterization of a 
receptor for GDNF. Nature 382: 80-83. 

Klein RD, Gu Q, Goddard A and Rosenthal A. (1996) Selection for genes encoding secreted 
proteins and receptors. Proc. Natl. Acad. Sci. USA 93: 7108-7113.

Winslow JW, Moran P, Valverde J, Shih A, Yuan JQ, Wong SC, Tsai SP, Goddard A, Henzel
WJ, Hefti F and Caras I. (1995) Cloning of AL-1, a ligand for an Eph-related tyrosine kinase
receptor involved in axon bundle formation. Neuron 14: 973-981.

Bennett BD, Zeigler FC, Gu Q, Fendly B, Goddard AD, Gillett N and Matthews W. (1995)
Molecular cloning of a ligand for the EPH-related receptor protein-tyrosine kinase Htk. Proc.
Natl. Acad. Sci. USA 92: 1866-1870.

Huang X, Yuang J, Goddard A, Foulis A, James RF, Lernmark A, Pujol-Borrell R, 
Rabinovitch A, Somoza N and Stewart TA. (1995) Interferon expression in the pancreases of 
patients with type I diabetes. Diabetes 44: 658-664. 

Goddard AD, Yuan JQ, Fairbairn L, Dexter M, Borrow J, Kozak C and Solomon E. (1995) 
Cloning of the murine homolog of the leukemia-associated PML gene. Mammalian Genome
6: 732-737.

Goddard AD, Covello R, Luoh SM, Clackson T, Attie KM, Gesundheit N, Rundle AC, Wells
JA, Carlsson LMS and The Growth Hormone Insensitivity Study Group. (1995) Mutations of
the growth hormone receptor in children with idiopathic short stature. N. Engl. J. Med. 333:
1093-1098. 

Kuo SS, Moran P, Gripp J, Armanini M, Phillips HS, Goddard A and Caras IW. (1994) 
Identification and characterization of Batk, a predominantly brain-specific non-receptor protein 
tyrosine kinase related to Csk. J. Neurosci. Res. 38: 705-715.

Mark MR, Scadden DT, Wang Z, Gu Q, Goddard A and Godowski PJ. (1994) Rse, a novel 
receptor-type tyrosine kinase with homology to Axl/Ufo, is expressed at high levels in the 
brain. Journal of Biological Chemistry 269: 10720-10728. 

Borrow J, Shipley J, Howe K, Kiely F, Goddard A, Sheer D, Srivastava A, Antony AC, 
Fioretos T, Mitelman F and Solomon E. (1994) Molecular analysis of simple variant
translocations in acute promyelocytic leukemia. Genes Chromosomes Cancer 9: 234-243. 

Goddard AD and Solomon E. (1993) Genetics of Cancer. Adv. Hum. Genet 21: 321-376. 

Borrow J, Goddard AD, Gibbons B, Katz F, Swirsky D, Fioretos T, Dube I, Winfield DA, 
Kingston J, Hagemeijer A, Rees JKH, Lister AT and Solomon E. (1992) Diagnosis of acute
promyelocytic leukemia by RT-PCR: Detection of PML-RARA and RARA-PML fusion 
transcripts. Br. J. Haematol. 82: 529-540. 

Goddard AD, Borrow J and Solomon E. (1992) A previously uncharacterized gene, PML, is 
fused to the retinoic acid receptor alpha gene in acute promyelocytic leukemia. Leukemia 6
Suppl 3: 117S-119S.

Zhu X, Dunn JM, Goddard AD, Squire JA, Becker A, Phillips RA and Gallie BL. (1992) 
Mechanisms of loss of heterozygosity in retinoblastoma. Cytogenet. Cell. Genet 59: 248-252. 

Foulkes W, Goddard A and Patel K. (1991) Retinoblastoma linked with Seascale [letter].
British Med. J. 302: 409.

Goddard AD, Borrow J, Freemont PS and Solomon E. (1991) Characterization of a novel zinc
finger gene disrupted by the t(15;17) in acute promyelocytic leukemia. Science 254:
1371-1374.

Solomon E, Borrow J and Goddard AD. (1991) Chromosomal aberrations in cancer. Science 
254:1153-1160. 

Pajunen L, Jones TA, Goddard A, Sheer D, Solomon E, Pihlajaniemi T and Kivirikko KI.
(1991) Regional assignment of the human gene coding for a multifunctional peptide (P4HB)
acting as the β-subunit of prolyl-4-hydroxylase and the enzyme protein disulfide isomerase to
17q25. Cytogenet. Cell. Genet. 56: 165-168.

Borrow J, Black DM, Goddard AD, Yagle MK, Frischauf A-M and Solomon E. (1991)
Construction and regional localization of a NotI linking library from human chromosome 17q.
Genomics 10: 477-480.

Borrow J, Goddard AD, Sheer D and Solomon E. (1990) Molecular analysis of acute
promyelocytic leukemia breakpoint cluster region on chromosome 17. Science 249:
1577-1580.

Myers JC, Jones TA, Pohjolainen E-R, Kadri AS, Goddard AD, Sheer D, Solomon E and
Pihlajaniemi T. (1990) Molecular cloning of α5(IV) collagen and assignment of the gene to the
region of the X-chromosome containing the Alport Syndrome locus. Am. J. Hum.
Genet. 46: 1024-1033.

Gallie BL, Squire JA, Goddard A, Dunn JM, Canton M, Hinton D, Zhu X and Phillips RA.
(1990) Mechanisms of oncogenesis in retinoblastoma. Lab. Invest. 62: 394-408.

Goddard AD, Phillips RA, Greger V, Passarge E, Hopping W, Gallie BL and Horsthemke B.
(1990) Use of the RB1 cDNA as a diagnostic probe in retinoblastoma families. Clinical
Genetics 37: 117-126.

Zhu XP, Dunn JM, Phillips RA, Goddard AD, Paton KE, Becker A and Gallie BL. (1989)
Germline, but not somatic, mutations of the RB1 gene preferentially involve the paternal
allele. Nature 340: 312-314.

Gallie BL, Dunn JM, Goddard A, Becker A and Phillips RA. (1988) Identification of mutations 
in the putative retinoblastoma gene. In Molecular Biology of The Eye: Genes, Vision and
Ocular Disease. UCLA Symposia on Molecular and Cellular Biology, New Series, Volume 88.
J. Piatigorsky, T. Shinohara and P.S. Zelenka, Eds. Alan R. Liss, Inc., New York, 1988, pp. 
427-436. 

Goddard AD, Balakier H, Canton M, Dunn J, Squire J, Reyes E, Becker A, Phillips RA and 
Gallie BL. (1988) Infrequent genomic rearrangement and normal expression of the putative 
RB1 gene in retinoblastoma tumors. Mol. Cell. Biol. 8: 2082-2088. 

Squire J, Dunn J, Goddard A, Hoffman T, Musarella M, Willard HF, Becker AJ, Gallie BL and 
Phillips RA. (1986) Cloning of the esterase D gene: A polymorphic gene probe closely linked 
to the retinoblastoma locus on chromosome 13. Proc. Natl. Acad. Sci. USA 83: 6573-6577. 

Squire J, Goddard AD, Canton M, Becker A, Phillips RA and Gallie BL (1986) Tumour 

We have enhanced the polymerase chain reaction (PCR) such that specific DNA sequences can
be detected without opening the reaction tube. This enhancement requires the addition of
ethidium bromide (EtBr) to a PCR. Since the fluorescence of EtBr increases in the presence of
double-stranded (ds) DNA, an increase in fluorescence in such a PCR indicates a positive
amplification, which can be easily monitored externally. In fact, amplification can be
continuously monitored in order to follow its progress. The ability to simultaneously amplify
specific DNA sequences and detect the product of the amplification both simplifies and
improves PCR and may facilitate its automation and more widespread use in the clinic or in
other situations requiring high sample throughput.

Although the potential benefits of PCR[1] to clinical diagnostics are well known[2,3], it is
still not widely used in this setting, even though it is four years since thermostable DNA
polymerases[4] made PCR practical. Some of the reasons for its slow acceptance are high
cost, lack of automation of pre- and post-PCR processing steps, and false positive results
from carryover-contamination. The first two points are related in that labor is the largest
contributor to cost at the present stage of PCR development. Most current assays require
some form of "downstream" processing once thermocycling is done in order to determine
whether the target DNA sequence was present and has amplified. These include DNA
hybridization[5,6], gel electrophoresis with or without use of restriction digestion[7,8],
HPLC[9], or capillary electrophoresis[10]. These methods are labor-intense, have low
throughput, and are difficult to automate. The third point is also closely related to
downstream processing. The handling of the PCR product in these downstream processes
increases the chances that amplified DNA will spread through the typing lab, resulting in a
risk of "carryover" false positives in subsequent testing.

These downstream processing steps would be eliminated if specific amplification and
detection of amplified DNA took place simultaneously within an unopened reaction vessel.
Assays in which such different processes take place without the need to separate reaction
components have been termed "homogeneous". No truly homogeneous PCR assay has been
demonstrated to date, although progress towards this end has been reported. Chehab, et
al., developed a PCR product detection scheme using fluorescent primers that resulted in a
fluorescent PCR product. Allele-specific primers, each with different fluorescent tags, were
used to indicate the genotype of the DNA. However, the unincorporated primers must still be
removed in a downstream process in order to visualize the result. Recently, Holland, et al.,
developed an assay in which the endogenous 5' exonuclease activity of Taq DNA polymerase
was exploited to cleave a labeled oligonucleotide probe. The probe would only cleave if PCR
amplification had produced its complementary sequence. In order to detect the cleavage
products, however, a subsequent process is again needed.

We have developed a truly homogeneous assay for PCR and PCR product detection based upon
the greatly increased fluorescence that ethidium bromide and other DNA binding dyes exhibit
when they are bound to ds-DNA[14-16]. As outlined in Figure 1, a prototypic PCR

FIGURE 1 Principle of simultaneous amplification and detection of PCR product. The
components of a PCR containing EtBr that are fluorescent are listed: EtBr itself, and EtBr
bound to either ssDNA or dsDNA. There is a large fluorescence enhancement when EtBr is
bound to DNA, and binding is greatly enhanced when DNA is double-stranded. After
sufficient (n) cycles of PCR, the net increase in dsDNA results in additional EtBr binding,
and a net increase in total fluorescence.

FIGURE 2 Gel electrophoresis of PCR amplification products of the human nuclear gene,
HLA DQα, made in the presence of increasing amounts of EtBr (up to 8 µg/ml). The presence
of EtBr has no obvious effect on the yield or specificity of amplification.








FIGURE 3 (A) Fluorescence measurements from PCRs that contain 0.5 µg/ml EtBr and that
are specific for Y-chromosome repeat sequences. Five replicate PCRs were begun containing
each of the DNAs specified. At each indicated cycle, one of the five replicate PCRs for each
DNA was removed from thermocycling and its fluorescence measured. Units of fluorescence
are arbitrary. (B) UV photography of PCR tubes (0.5 ml Eppendorf-style, polypropylene
micro-centrifuge tubes) containing reactions, those starting from 2 ng male DNA and control
reactions without any DNA, from (A).



begins with primers that are single-stranded DNA (ss-DNA), dNTPs, and DNA polymerase. An
amount of dsDNA containing the target sequence (target DNA) is also typically present. This
amount can vary, depending on the application, from single-cell amounts of DNA[17] to
micrograms per PCR[18]. If EtBr is present, the reagents that will fluoresce, in order of
increasing fluorescence, are free EtBr itself, and EtBr bound to the single-stranded DNA
primers and to the double-stranded target DNA (by its intercalation between the stacked
bases of the DNA double-helix). After the first denaturation cycle, target DNA will be largely
single-stranded. After a PCR is completed, the most significant change is the increase in the
amount of dsDNA (the PCR product itself) of up to several micrograms. Formerly free EtBr is
bound to the additional dsDNA, resulting in an increase in fluorescence. There is also some
decrease in the amount of ssDNA primer, but because the binding of EtBr to ssDNA is much
less than to dsDNA, the effect of this change on the total fluorescence of the sample is small.
The fluorescence increase can be measured by directing excitation illumination through the
walls of the amplification vessel before and after, or even continuously during,
thermocycling.

RESULTS 

PCR in the presence of EtBr. In order to assess the effect of EtBr on PCR, amplifications of
the human HLA DQα gene[19] were performed with the dye present at concentrations from
0.06 to 8.0 µg/ml (a typical concentration of EtBr used in staining of nucleic acids following
gel electrophoresis is 0.5 µg/ml). As shown in Figure 2, gel electrophoresis revealed little or
no difference in the yield or quality of the amplification product whether EtBr was absent or
present at any of these concentrations, indicating that EtBr does not inhibit PCR.

Detection of human Y-chromosome specific sequences. Sequence-specific fluorescence
enhancement of EtBr as a result of PCR was demonstrated in a series of amplifications
containing 0.5 µg/ml EtBr and primers specific to repeat DNA sequences found on the human
Y-chromosome[20]. These PCRs initially contained either 60 ng male, 60 ng female, 2 ng male
human or no DNA. Five replicate PCRs were begun for each DNA. After 17, 21, 24 and 29
cycles of thermocycling, a PCR for each DNA was removed from the thermocycler, and its
fluorescence measured in a spectrofluorometer and plotted vs. amplification cycle number
(Fig. 3A). The shape of this curve reflects the fact that by the time an increase in
fluorescence can be detected, the increase in DNA is becoming linear and not exponential
with cycle number. As shown, the fluorescence increased about three-fold over the
background fluorescence for the PCRs containing human male DNA, but did not significantly
increase for negative control PCRs, which contained either no DNA or human female DNA.
The more male DNA present to begin with (60 ng versus 2 ng), the fewer cycles were needed
to give a detectable increase in fluorescence. Gel electrophoresis on the products of these
amplifications showed that DNA fragments of the expected size were made in the male DNA
containing reactions and that little DNA synthesis took place in the control samples.

In addition, the increase in fluorescence was visualized by simply laying the completed,
unopened PCRs on a UV transilluminator and photographing them through a red filter. This
is shown in Figure 3B for the reactions that began with 2 ng male DNA and those with no DNA.
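
The observation above that more starting male DNA (60 ng versus 2 ng) needs fewer cycles to
give a detectable signal can be illustrated with a minimal sketch. It assumes idealized
amplification by a fixed factor every cycle, which the text itself notes no longer holds by
the time fluorescence becomes detectable, so the result is illustrative only. The function
name, the `efficiency` parameter, and the copy numbers are hypothetical and are not taken
from the paper.

```python
def cycles_to_detect(start_copies, detect_copies, efficiency=1.0, max_cycles=60):
    """Return the first cycle at which product reaches detect_copies, else None.

    Hypothetical model: each cycle multiplies the product by (1 + efficiency),
    so efficiency=1.0 means perfect doubling. Real PCRs plateau; this does not.
    """
    copies = float(start_copies)
    for cycle in range(1, max_cycles + 1):
        copies *= 1.0 + efficiency
        if copies >= detect_copies:
            return cycle
    return None
```

Under these assumptions a 30-fold larger starting amount, the same ratio as 60 ng to 2 ng,
reaches a given detection level about log2(30), or roughly five, cycles earlier.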

Detection of specific alleles of the human β-globin gene. In order to demonstrate that this
approach has adequate specificity to allow genetic screening, a detection of the sickle-cell
anemia mutation was performed. Figure 4 shows the fluorescence from completed
amplifications containing EtBr (0.5 µg/ml) as detected by photography of the reaction tubes
on a UV transilluminator. These reactions were performed using primers specific for either
the wild-type or sickle-cell mutation of the human β-globin gene. The specificity for each
allele is imparted by placing the sickle-mutation site at the terminal 3' nucleotide of one
primer. By using an appropriate primer annealing temperature, primer extension (and thus
amplification) can take place only if the 3' nucleotide of the primer is complementary to
the β-globin allele present.

Each pair of amplifications shown in Figure 4 consists of a reaction with either the
wild-type allele specific (left tube) or sickle-allele specific (right tube) primers. Three
different DNAs were typed: DNA from a homozygous, wild-type β-globin individual (AA);
from a heterozygous sickle β-globin individual (AS); and from a homozygous sickle β-globin
individual (SS). Each DNA (50 ng genomic DNA to start each PCR) was analyzed in triplicate
(3 pairs of reactions each). The DNA type was reflected in the relative fluorescence
intensities in each pair of completed amplifications. There was a significant increase in
fluorescence only where a β-globin allele DNA matched the primer set. When measured on a
spectrofluorometer (data not shown), this fluorescence was about three times that present
in a PCR where both β-globin alleles were mismatched to the primer set. Gel electrophoresis
(not shown) established that this increase in fluorescence was due to the synthesis of nearly
a microgram of a DNA fragment of the expected size for β-globin. There was little synthesis
of dsDNA in reactions in which the allele-specific primer was mismatched to both alleles.
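
The typing logic described above, in which each DNA is tested with a wild-type-specific and a
sickle-specific primer set and the genotype is read from which tube fluoresces, can be
sketched as follows. This is an illustration under stated assumptions, not the authors'
procedure: the function name, the `background` argument, and the `min_ratio` cutoff are
hypothetical. The text reports only that a matched reaction gave about three times the
fluorescence of a fully mismatched one.

```python
def call_genotype(wild_type_signal, sickle_signal, background, min_ratio=2.0):
    """Call AA, AS or SS from the fluorescence of a pair of allele-specific PCRs.

    A tube counts as positive when its signal is at least min_ratio times the
    background (hypothetical cutoff, chosen below the reported ~3-fold increase).
    """
    wild_type_positive = wild_type_signal >= min_ratio * background
    sickle_positive = sickle_signal >= min_ratio * background
    if wild_type_positive and sickle_positive:
        return "AS"  # both primer sets amplified: heterozygous
    if wild_type_positive:
        return "AA"  # only the wild-type primers amplified
    if sickle_positive:
        return "SS"  # only the sickle primers amplified
    return "no call"  # neither tube rose above background
```

A pair in which neither tube is positive is reported as "no call" rather than forced into a
genotype, since a failed amplification cannot be told apart from a true negative here.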

Continuous monitoring of a PCR. Using a fiber optic device, it is possible to direct
excitation illumination from a spectrofluorometer to a PCR undergoing thermocycling and to
return its fluorescence to the spectrofluorometer. The fluorescence readout of such an
arrangement, directed at an EtBr-containing amplification of Y-chromosome specific
sequences from 25 ng of human male DNA, is shown in Figure 5. The readout from a control
PCR with no target DNA is also shown. Thirty cycles of PCR were monitored for each.

The fluorescence trace as a function of time clearly shows the effect of the thermocycling.
Fluorescence intensity rises and falls inversely with temperature. The fluorescence intensity
is minimum at the denaturation temperature (94°C) and maximum at the annealing/extension
temperature (50°C). In the negative-control PCR, these fluorescence maxima and minima do
not change significantly over the thirty thermocycles, indicating that there is little dsDNA
synthesis without the appropriate target DNA, and there is little if any bleaching of EtBr
during the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence
maxima at the annealing/extension temperature begin to
increase at about 4000 seconds of thermocycling, and
continue to increase with time, indicating that dsDNA is
being produced at a detectable level. Note that the
fluorescence minima at the denaturation temperature do not
significantly increase, presumably because at this
temperature there is no dsDNA for EtBr to bind. Thus the course
of the amplification is followed by tracking the
fluorescence increase at the annealing temperature. Analysis of
the products of these two amplifications by gel
electrophoresis showed a DNA fragment of the expected size for the
male DNA containing sample and no detectable DNA
synthesis for the control sample.
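
The tracking described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis: it assumes the fluorometer delivers a fixed number of readings per thermocycle, treats the largest reading in each cycle as the annealing/extension maximum, and flags the first cycle whose maximum exceeds the mean of an early baseline by a chosen number of standard deviations. The function names, the ten-cycle baseline, and the factor of three are all hypothetical choices.

```python
from statistics import mean, stdev


def per_cycle_maxima(fluorescence, samples_per_cycle):
    """Return the largest fluorescence reading within each complete thermocycle.

    Assumes (hypothetically) that every cycle contributes the same number
    of readings, so the trace can be cut into equal-length blocks.
    """
    last_start = len(fluorescence) - samples_per_cycle + 1
    return [
        max(fluorescence[i:i + samples_per_cycle])
        for i in range(0, last_start, samples_per_cycle)
    ]


def first_rising_cycle(maxima, baseline_cycles=10, k=3.0):
    """Return the 1-based cycle at which the maxima first exceed the baseline.

    The baseline is the mean of the first `baseline_cycles` maxima plus
    k sample standard deviations. Returns None if no cycle exceeds it.
    """
    baseline = maxima[:baseline_cycles]
    threshold = mean(baseline) + k * stdev(baseline)
    for cycle, value in enumerate(maxima[baseline_cycles:], start=baseline_cycles + 1):
        if value > threshold:
            return cycle
    return None
```

Applied to a negative-control trace, `first_rising_cycle` would be expected to return `None`, mirroring the flat maxima reported for the control PCR.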

DISCUSSION 

Downstream processes such as hybridization to a
sequence-specific probe can enhance the specificity of DNA
detection by PCR. The elimination of these processes
means that the specificity of this homogeneous assay
depends solely on that of PCR. In the case of sickle-cell
disease, we have shown that PCR alone has sufficient DNA
sequence specificity to permit genetic screening. Using
appropriate amplification conditions, there is little
non-specific production of dsDNA in the absence of the
appropriate target allele.

The specificity required to detect pathogens can be
more or less than that required to do genetic screening,
depending on the number of pathogens in the sample and
the amount of other DNA that must be taken with the
sample. A difficult target is HIV, which requires detection
of a viral genome that can be at the level of a few copies
per thousands of host cells. Compared with genetic
screening, which is performed on cells containing at least
one copy of the target sequence, HIV detection requires
both more specificity and the input of more total
DNA, up to microgram amounts, in order to have
sufficient numbers of target sequences. This large amount of
starting DNA in an amplification significantly increases
the background fluorescence over which any additional
fluorescence produced by PCR must be detected. An
additional complication that occurs with targets in low
copy-number is the formation of the "primer-dimer"
artifact. This is the result of the extension of one primer
using the other primer as a template. Although this occurs
infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with
true PCR targets if those targets are rare. The
primer-dimer product is of course dsDNA and thus is a potential
source of false signal in this homogeneous assay.

FIGURE 4 UV photography of PCR tubes containing amplifications
using EtBr that are specific to wild-type (A) or sickle (S) alleles of
the human β-globin gene. The left of each pair of tubes contains
allele-specific primers to the wild-type alleles, the right tube
primers to the sickle allele. The photograph was taken after 30
cycles of PCR, and the input DNAs and the alleles they contain
are indicated. Fifty ng of DNA was used to begin PCR. Typing
was done in triplicate (3 pairs of PCRs) for each input DNA.

FIGURE 5 Continuous, real-time monitoring of a PCR. A fiber optic
was used to carry excitation light to a PCR in progress and also
emitted light back to a fluorometer (see Experimental Protocol).
Amplification using human male-DNA specific primers in a PCR
starting with 20 ng of human male DNA (top), or in a control
PCR without DNA (bottom), were monitored. Thirty cycles of
PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR, the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a
number of approaches, including the use of nested-primer
amplifications that take place in a single tube 3, and the
"hot-start", in which nonspecific amplification is reduced
by raising the temperature of the reaction before DNA
synthesis begins 25. Preliminary results using these
approaches suggest that primer-dimer is effectively reduced
and it is possible to detect the increase in EtBr
fluorescence in a PCR instigated by a single HIV genome in a
background of 10^5 cells. With larger numbers of cells, the
background fluorescence contributed by genomic DNA
becomes problematic. To reduce this background, it may
be possible to use sequence-specific DNA-binding dyes
that can be made to preferentially bind PCR product over
genomic DNA by incorporating the dye-binding DNA
sequence into the PCR product through a 5' "add-on" to
the oligonucleotide primer.

We have shown that the detection of fluorescence
generated by an EtBr-containing PCR is straightforward,
both once PCR is completed and continuously during
thermocycling. The ease with which automation of
specific DNA detection can be accomplished is the most
promising aspect of this assay. The fluorescence analysis
of completed PCRs is already possible with existing
instrumentation in 96-well format. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving
the rack of PCRs to a 96-microwell plate fluorescence
reader.

The instrumentation necessary to continuously monitor
multiple PCRs simultaneously is also simple in principle.
A direct extension of the apparatus used here is to have
multiple fiberoptics transmit the excitation light and
fluorescent emissions to and from multiple PCRs. The ability
to monitor multiple PCRs continuously may allow
quantitation of target DNA copy number. Figure 3 shows that
the larger the amount of starting target DNA, the sooner
during PCR a fluorescence increase is detected.
Preliminary experiments (Higuchi and Dollinger, manuscript in
preparation) with continuous monitoring have shown a
sensitivity to two-fold differences in initial target DNA
concentration.
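
As a rough illustration of why the cycle at which fluorescence rises carries information about starting copy number, the following Python sketch uses an idealized exponential model. It is an assumption of this sketch, not a result from the paper, that product multiplies by (1 + efficiency) every cycle until detection; the text itself notes that amplification is already becoming linear by the time fluorescence is detectable, so the numbers are at best approximate. The function names are hypothetical.

```python
import math


def cycle_shift(amount_a, amount_b, efficiency=1.0):
    """Cycles by which sample B lags sample A in reaching the same product level.

    Idealized model: product grows by a factor of (1 + efficiency) per cycle,
    so efficiency=1.0 means perfect doubling.
    """
    return math.log(amount_a / amount_b) / math.log(1.0 + efficiency)


def fold_difference(shift_in_cycles, efficiency=1.0):
    """Inverse relation: starting-amount ratio implied by an observed cycle shift."""
    return (1.0 + efficiency) ** shift_in_cycles
```

Under perfect doubling, a two-fold difference in starting target corresponds to a shift of one cycle, and the 60 ng versus 2 ng male DNA samples would be expected to differ by roughly five cycles (log2 of 30).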

Conversely, if the number of target molecules is
known, as it can be in genetic screening, continuous
monitoring may provide a means of detecting false
positive and false negative results. With a known number of
target molecules, a true positive would exhibit detectable
fluorescence by a predictable number of cycles of PCR.
Increases in fluorescence detected before or after that
cycle would indicate potential artifacts. False negative
results due to, for example, inhibition of DNA
polymerase, may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of
cycles, many more than are necessary to detect a true
positive. If a sample fails to have a fluorescence increase
after this many cycles, inhibition may be suspected. Since,
in this assay, conclusions are drawn based on the presence
or absence of fluorescence signal alone, such controls may
be important. In any event, before any test based on this
principle is ready for the clinic, an assessment of its false
positive/false negative rates will need to be obtained using
a large number of known samples.
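
The control logic proposed in the preceding paragraph can be written out as a small decision rule. The Python sketch below is one possible reading of it, under stated assumptions: a single fluorescence signal per tube, a known cycle at which a true positive should rise, a later cycle at which the inefficiently amplifying marker alone should rise, and a tolerance (in cycles) that is smaller than half the gap between the two. The function name, labels, and default tolerance are hypothetical.

```python
def classify_run(rise_cycle, expected_cycle, marker_cycle, tolerance=2):
    """Classify one PCR from the cycle at which its fluorescence first rises.

    rise_cycle     -- observed cycle of first detectable increase, or None
    expected_cycle -- cycle at which a true positive should become detectable
    marker_cycle   -- later cycle at which the internal marker alone rises
    tolerance      -- allowed deviation in cycles (assumed, not from the paper)
    """
    # No increase even after the marker should have appeared: the
    # amplification itself probably failed.
    if rise_cycle is None or rise_cycle > marker_cycle + tolerance:
        return "inhibition suspected"
    # Increase at the predictable cycle for the known target number.
    if abs(rise_cycle - expected_cycle) <= tolerance:
        return "positive"
    # Increase only where the marker alone is expected: target absent,
    # but the reaction worked.
    if abs(rise_cycle - marker_cycle) <= tolerance:
        return "negative"
    # Increase clearly before or after the expected cycle.
    return "possible artifact"
```

For example, with an expected cycle of 20 and a marker cycle of 35, a rise at cycle 12 would be reported as a possible artifact rather than as a positive.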

In summary, the inclusion in PCR of dyes whose fluo-
rescence is enhanced upon binding dsDNA makes it
possible to detect specific DNA amplification from outside
the PCR tube. In the future, instruments based upon this 
principle may facilitate the more widespread use of PCR 
in applications that demand the high throughput of 
samples. 

EXPERIMENTAL PROTOCOL 

Human HLA-DQα gene amplifications containing EtBr.
PCRs were set up in 100 µl volumes containing 10 mM Tris-HCl,
pH 8.3; 50 mM KCl; 4 mM MgCl2; 2.5 units of Taq DNA
polymerase (Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each
of human HLA-DQα gene specific oligonucleotide primers
GH26 and GH27 19 and approximately 10* copies of DQα PCR
product diluted from a previous reaction. Ethidium bromide
(EtBr; Sigma) was used at the concentrations indicated in Figure
2. Thermocycling proceeded for 20 cycles in a model 480
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a
"step-cycle" program of 94°C for 1 min. denaturation and 60°C for 30
sec. annealing and 72°C for 30 sec. extension.

Y-chromosome specific PCR. PCRs (100 µl total reaction
volume) containing 0.5 µg/ml EtBr were prepared as described
for HLA-DQα, except with different primers and target DNAs.
These PCRs contained 15 pmole each male DNA-specific primers
Y1.1 and Y1.2 20, and either 60 ng male, 60 ng female, 2 ng male,
or no human DNA. Thermocycling was 94°C for 1 min. and 60°C
for 1 min. using a "step-cycle" program. The number of cycles for
a sample were as indicated in Figure 3. Fluorescence
measurement is described below.

Allele-specific, human β-globin gene PCR. Amplifications of
100 µl volume using 0.5 µg/ml of EtBr were prepared as
described for HLA-DQα above except with different primers and
target DNAs. These PCRs contained either primer pair HGP2/
Hβ14A (wild-type globin specific primers) or HGP2/Hβ14S
(sickle-globin specific primers) at 10 pmole each primer per PCR.
These primers were developed by Wu et al. 21. Three different
target DNAs were used in separate amplifications: 50 ng each of
human DNA that was homozygous for the sickle trait (SS), DNA
that was heterozygous for the sickle trait (AS), or DNA that was
homozygous for the w.t. globin (AA). Thermocycling was for 30
cycles at 94°C for 1 min. and 55°C for 1 min. using a "step-cycle"
program. An annealing temperature of 55°C had been shown by
Wu et al. 21 to provide allele-specific amplification. Completed
PCRs were photographed through a red filter (Wratten 23A)
after placing the reaction tubes atop a model TM-36
transilluminator (UV-Products, San Gabriel, CA).

Fluorescence measurement. Fluorescence measurements were
made on PCRs containing EtBr in a Fluorolog-2 fluorometer
(SPEX, Edison, NJ). Excitation was at the 500 nm band with
about 2 nm bandwidth with a GG 435 nm cut-off filter (Melles
Griot, Inc., Irvine, CA) to exclude second-order light. Emitted
light was detected at 570 nm with a bandwidth of about 7 nm. An
OG 530 nm cut-off filter was used to remove the excitation light.

Continuous fluorescence monitoring of PCR. Continuous
monitoring of a PCR in progress was accomplished using the
spectrofluorometer and settings described above as well as a
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation
light to, and receive emitted light from, a PCR placed in a well of
a model 480 thermocycler (Perkin-Elmer Cetus). The probe end
of the fiberoptic cable was attached with "5 minute-epoxy" to the
open top of a PCR tube (a 0.5 ml polypropylene centrifuge tube
with its cap removed) effectively sealing it. The exposed top of
the PCR tube and the end of the fiberoptic cable were shielded
from room light and the room lights were kept dimmed during
each run. The monitored PCR was an amplification of
Y-chromosome-specific repeat sequences as described above, except
using an annealing/extension temperature of 50°C. The reaction
was covered with mineral oil (2 drops) to prevent evaporation.
Thermocycling and fluorescence measurement were started
simultaneously. A time-base scan with a 10 second integration time
was used and the emission signal was ratioed to the excitation
signal to control for changes in light-source intensity. Data were
collected using the dm3000f, version 2.5 (SPEX) data system.
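
The ratio correction mentioned at the end of the protocol can be illustrated with a short Python sketch. It assumes, hypothetically, that the data system exports paired emission and excitation (reference) readings taken at the same time points; the function name is invented for this illustration and is not part of the SPEX software.

```python
def ratio_signal(emission, excitation):
    """Divide each emission reading by the matching excitation reading.

    A drift in lamp intensity scales both readings by the same factor,
    so the ratio stays constant unless the sample itself changes.
    """
    if len(emission) != len(excitation):
        raise ValueError("emission and excitation must have the same length")
    return [e / x for e, x in zip(emission, excitation)]
```

In this sketch, a lamp that brightens by 10 percent raises both readings by 10 percent and leaves the ratio unchanged.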
SIMULTANEOUS AMPLIFICATION AND DETECTION OF
SPECIFIC DNA SEQUENCES




Russell Higuchi*, Gavin Dollinger 1, P. Sean Walsh and Robert Griffith

Roche Molecular Systems, Inc., 1400 53rd St., Emeryville, CA 94608. 1 Chiron Corporation, 1400 53rd St., Emeryville, CA
94608. *Corresponding author.



We have enhanced the polymerase chain
reaction (PCR) such that specific DNA
sequences can be detected without opening
the reaction tube. This enhancement
requires the addition of ethidium bromide
(EtBr) to a PCR. Since the fluorescence of
EtBr increases in the presence of
double-stranded (ds) DNA an increase in
fluorescence in such a PCR indicates a positive
amplification, which can be easily
monitored externally. In fact, amplification can
be continuously monitored in order to
follow its progress. The ability to
simultaneously amplify specific DNA sequences
and detect the product of the amplification
both simplifies and improves PCR and
may facilitate its automation and more
widespread use in the clinic or in other
situations requiring high sample
throughput.

Although the potential benefits of PCR 1 to clinical
diagnostics are well known 2,3, it is still not
widely used in this setting, even though it is
four years since thermostable DNA polymerases 4
made PCR practical. Some of the reasons for its slow
acceptance are high cost, lack of automation of pre- and
post-PCR processing steps, and false positive results from
carryover-contamination. The first two points are related
in that labor is the largest contributor to cost at the present
stage of PCR development. Most current assays require
some form of "downstream" processing once
thermocycling is done in order to determine whether the target
DNA sequence was present and has amplified. These
include DNA hybridization, gel electrophoresis with or
without use of restriction digestion, HPLC, or capillary
electrophoresis 10. These methods are labor intense, have
low throughput, and are difficult to automate. The third
point is also closely related to downstream processing.
The handling of the PCR product in these downstream
processes increases the chances that amplified DNA will
spread through the typing lab, resulting in a risk of
"carryover" false positives in subsequent testing 11.

These downstream processing steps would be
eliminated if specific amplification and detection of amplified
DNA took place simultaneously within an unopened
reaction vessel. Assays in which such different processes take
place without the need to separate reaction components
have been termed "homogeneous". No truly
homogeneous PCR assay has been demonstrated to date, although
progress towards this end has been reported. Chehab, et
al. 12, developed a PCR product detection scheme using
fluorescent primers that resulted in a fluorescent PCR
product. Allele-specific primers, each with different
fluorescent tags, were used to indicate the genotype of the
DNA. However, the unincorporated primers must still be
removed in a downstream process in order to visualize the
result. Recently, Holland, et al. 13, developed an assay in
which the endogenous 5' exonuclease activity of Taq DNA
polymerase was exploited to cleave a labeled
oligonucleotide probe. The probe would only cleave if PCR
amplification had produced its complementary sequence. In
order to detect the cleavage products, however, a
subsequent process is again needed.

We have developed a truly homogeneous assay for PCR
and PCR product detection based upon the greatly
increased fluorescence that ethidium bromide and other
DNA binding dyes exhibit when they are bound to
ds-DNA 14-16. As outlined in Figure 1, a prototype PCR
begins with primers that are single-stranded DNA
(ss-DNA), dNTPs, and DNA polymerase. An amount of
dsDNA containing the target sequence (target DNA) is
also typically present. This amount can vary, depending
on the application, from single-cell amounts of DNA 17 to
micrograms per PCR. If EtBr is present, the reagents
that will fluoresce, in order of increasing fluorescence, are
free EtBr itself, and EtBr bound to the single-stranded
DNA primers and to the double-stranded target DNA (by
its intercalation between the stacked bases of the DNA
double-helix). After the first denaturation cycle, target
DNA will be largely single-stranded. After a PCR is
completed, the most significant change is the increase in
the amount of dsDNA (the PCR product itself) of up to
several micrograms. Formerly free EtBr is bound to the
additional dsDNA, resulting in an increase in
fluorescence. There is also some decrease in the amount of
ssDNA primer, but because the binding of EtBr to ssDNA
is much less than to dsDNA, the effect of this change on
the total fluorescence of the sample is small. The
fluorescence increase can be measured by directing excitation
illumination through the walls of the amplification vessel
before and after, or even continuously during,
thermocycling.

FIGURE 1 Principle of simultaneous amplification and detection of
PCR product. The components of a PCR containing EtBr that are
fluorescent are listed: EtBr itself, EtBr bound to either ssDNA or
dsDNA. There is a large fluorescence enhancement when EtBr is
bound to DNA and binding is greatly enhanced when DNA is
double-stranded. After sufficient (n) cycles of PCR, the net
increase in dsDNA results in additional EtBr binding and a net
increase in total fluorescence.

FIGURE 2 Gel electrophoresis of PCR amplification products of the
human, nuclear gene, HLA DQα, made in the presence of
increasing amounts of EtBr (up to 8 µg/ml). The presence of
EtBr has no obvious effect on the yield or specificity of
amplification.

FIGURE 3 (A) Fluorescence measurements from PCRs that contain
0.5 µg/ml EtBr and that are specific for Y-chromosome repeat
sequences. Five replicate PCRs were begun containing each of the
DNAs specified. At each indicated cycle, one of the five replicate
PCRs for each DNA was removed from thermocycling and its
fluorescence measured. Units of fluorescence are arbitrary. (B)
UV photography of PCR tubes (0.5 ml Eppendorf-style,
polypropylene micro-centrifuge tubes) containing reactions, those
starting from 2 ng male DNA and control reactions without any DNA,
from (A).

RESULTS 

PCR in the presence of EtBr. In order to assess the
effect of EtBr in PCR, amplifications of the human HLA
DQα gene 19 were performed with the dye present at
concentrations from 0.06 to 8.0 µg/ml (a typical
concentration of EtBr used in staining of nucleic acids following
gel electrophoresis is 0.5 µg/ml). As shown in Figure 2, gel
electrophoresis revealed little or no difference in the yield
or quality of the amplification product whether EtBr was
absent or present at any of these concentrations,
indicating that EtBr does not inhibit PCR.

Detection of human Y-chromosome specific
sequences. Sequence-specific, fluorescence enhancement of
EtBr as a result of PCR was demonstrated in a series of
amplifications containing 0.5 µg/ml EtBr and primers
specific to repeat DNA sequences found on the human
Y-chromosome 20. These PCRs initially contained either
60 ng male, 60 ng female, 2 ng male human or no DNA.
Five replicate PCRs were begun for each DNA. After 0,
17, 21, 24 and 29 cycles of thermocycling, a PCR for each
DNA was removed from the thermocycler, and its
fluorescence measured in a spectrofluorometer and plotted
vs. amplification cycle number (Fig. 3A). The shape of this
curve reflects the fact that by the time an increase in
fluorescence can be detected, the increase in DNA is
becoming linear and not exponential with cycle number.
As shown, the fluorescence increased about three-fold
over the background fluorescence for the PCRs
containing human male DNA, but did not significantly increase
for negative control PCRs, which contained either no
DNA or human female DNA. The more male DNA
present to begin with (60 ng versus 2 ng), the fewer
cycles were needed to give a detectable increase in
fluorescence. Gel electrophoresis on the products of these
amplifications showed that DNA fragments of the
expected size were made in the male DNA containing
reactions and that little DNA synthesis took place in the
control samples.

In addition, the increase in fluorescence was visualized
by simply laying the completed, unopened PCRs on a UV
transilluminator and photographing them through a red
filter. This is shown in Figure 3B for the reactions that
began with 2 ng male DNA and those with no DNA.

Detection of specific alleles of the human β-globin
gene. In order to demonstrate that this approach has
adequate specificity to allow genetic screening, a detection
of the sickle-cell anemia mutation was performed. Figure
4 shows the fluorescence from completed amplifications
containing EtBr (0.5 µg/ml) as detected by photography
of the reaction tubes on a UV transilluminator. These
reactions were performed using primers specific for
either the wild-type or sickle-cell mutation of the human
β-globin gene. The specificity for each allele is imparted
by placing the sickle-mutation site at the terminal 3'
nucleotide of one primer. By using an appropriate primer
annealing temperature, primer extension, and thus
amplification, can take place only if the 3' nucleotide of the
primer is complementary to the β-globin allele present 22.
Each pair of amplifications shown in Figure 4 consists of
a reaction with either the wild-type allele specific (left
tube) or sickle-allele specific (right tube) primers. Three
different DNAs were typed: DNA from a homozygous,
wild-type β-globin individual (AA); from a heterozygous
sickle β-globin individual (AS); and from a homozygous
sickle β-globin individual (SS). Each DNA (50 ng genomic
DNA to start each PCR) was analyzed in triplicate (3 pairs
of reactions each). The DNA type was reflected in the
relative fluorescence intensities in each pair of completed
amplifications. There was a significant increase in
fluorescence only where a β-globin allele DNA matched the
primer set. When measured on a spectrofluorometer
(data not shown), this fluorescence was about three times
that present in a PCR where both β-globin alleles were
mismatched to the primer set. Gel electrophoresis (not
shown) established that this increase in fluorescence was
due to the synthesis of nearly a microgram of a DNA
fragment of the expected size for β-globin. There was
little synthesis of dsDNA in reactions in which the
allele-specific primer was mismatched to both alleles.

Continuous monitoring of a PCR. Using a fiber optic
device, it is possible to direct excitation illumination from
a spectrofluorometer to a PCR undergoing thermocycling
and to return its fluorescence to the spectrofluorometer.
The fluorescence readout of such an arrangement,
directed at an EtBr-containing amplification of
Y-chromosome specific sequences from 25 ng of human male DNA,
is shown in Figure 5. The readout from a control PCR
with no target DNA is also shown. Thirty cycles of PCR
were monitored for each.

The fluorescence trace as a function of time clearly
shows the effect of the thermocycling. Fluorescence
intensity rises and falls inversely with temperature. The
fluorescence intensity is minimum at the denaturation
temperature (94°C) and maximum at the annealing/extension
temperature (50°C). In the negative-control PCR, these
fluorescence maxima and minima do not change
significantly over the thirty thermocycles, indicating that there is
little dsDNA synthesis without the appropriate target
DNA, and there is little if any bleaching of EtBr during
the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence
maxima at the annealing/extension temperature begin to
increase at about 4000 seconds of thermocycling, and
continue to increase with time, indicating that dsDNA is
being produced at a detectable level. Note that the
fluorescence minima at the denaturation temperature do not
significantly increase, presumably because at this
temperature there is no dsDNA for EtBr to bind. Thus the course
of the amplification is followed by tracking the
fluorescence increase at the annealing temperature. Analysis of
the products of these two amplifications by gel
electrophoresis showed a DNA fragment of the expected size for the
male DNA containing sample and no detectable DNA
synthesis for the control sample.

DISCUSSION 

Downstream processes such as hybridization to a
sequence-specific probe can enhance the specificity of DNA
detection by PCR. The elimination of these processes
means that the specificity of this homogeneous assay
depends solely on that of PCR. In the case of sickle-cell
disease, we have shown that PCR alone has sufficient DNA
sequence specificity to permit genetic screening. Using
appropriate amplification conditions, there is little
non-specific production of dsDNA in the absence of the
appropriate target allele.

The specificity required to detect pathogens can be
more or less than that required to do genetic screening,
depending on the number of pathogens in the sample and
the amount of other DNA that must be taken with the
sample. A difficult target is HIV, which requires detection
of a viral genome that can be at the level of a few copies
per thousands of host cells. Compared with genetic
screening, which is performed on cells containing at least
one copy of the target sequence, HIV detection requires
both more specificity and the input of more total
DNA, up to microgram amounts, in order to have
sufficient numbers of target sequences. This large amount of
starting DNA in an amplification significantly increases
the background fluorescence over which any additional
fluorescence produced by PCR must be detected. An
additional complication that occurs with targets in low
copy-number is the formation of the "primer-dimer"
artifact. This is the result of the extension of one primer
using the other primer as a template. Although this occurs
infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with
true PCR targets if those targets are rare. The
primer-dimer product is of course dsDNA and thus is a potential
source of false signal in this homogeneous assay.

FIGURE 4 UV photography of PCR tubes containing amplifications
using EtBr that are specific to wild-type (A) or sickle (S) alleles of
the human β-globin gene. The left of each pair of tubes contains
allele-specific primers to the wild-type alleles, the right tube
primers to the sickle allele. The photograph was taken after 30
cycles of PCR, and the input DNAs and the alleles they contain
are indicated. Fifty ng of DNA was used to begin PCR. Typing
was done in triplicate (3 pairs of PCRs) for each input DNA.

FIGURE 5 Continuous, real-time monitoring of a PCR. A fiber optic
was used to carry excitation light to a PCR in progress and also
emitted light back to a fluorometer (see Experimental Protocol).
Amplification using human male-DNA specific primers in a PCR
starting with 20 ng of human male DNA (top), or in a control
PCR without DNA (bottom), were monitored. Thirty cycles of
PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR, the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a
number of approaches, including the use of nested-primer
amplifications that take place in a single tube 3, and the
"hot-start", in which nonspecific amplification is reduced
by raising the temperature of the reaction before DNA
synthesis begins 25. Preliminary results using these
approaches suggest that primer-dimer is effectively reduced
and it is possible to detect the increase in EtBr
fluorescence in a PCR instigated by a single HIV genome in a
background of 10^5 cells. With larger numbers of cells, the
background fluorescence contributed by genomic DNA
becomes problematic. To reduce this background, it may
be possible to use sequence-specific DNA-binding dyes
that can be made to preferentially bind PCR product over
genomic DNA by incorporating the dye-binding DNA
sequence into the PCR product through a 5' "add-on" to
the oligonucleotide primer.

We have shown that the detection of fluorescence
generated by an EtBr-containing PCR is straightforward,
both once PCR is completed and continuously during
thermocycling. The ease with which automation of
specific DNA detection can be accomplished is the most
promising aspect of this assay. The fluorescence analysis
of completed PCRs is already possible with existing
instrumentation in 96-well format. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving
the rack of PCRs to a 96-microwell plate fluorescence
reader.

The instrumentation necessary to continuously monitor
multiple PCRs simultaneously is also simple in principle.
A direct extension of the apparatus used here is to have
multiple fiberoptics transmit the excitation light and
fluorescent emissions to and from multiple PCRs. The ability
to monitor multiple PCRs continuously may allow
quantitation of target DNA copy number. Figure 3 shows that
the larger the amount of starting target DNA, the sooner
during PCR a fluorescence increase is detected.
Preliminary experiments (Higuchi and Dollinger, manuscript in
preparation) with continuous monitoring have shown a
sensitivity to two-fold differences in initial target DNA
concentration.

Conversely, if the number of target molecules is
known, as it can be in genetic screening, continuous
monitoring may provide a means of detecting false
positive and false negative results. With a known number of
target molecules, a true positive would exhibit detectable
fluorescence by a predictable number of cycles of PCR.
Increases in fluorescence detected before or after that
cycle would indicate potential artifacts. False negative
results due to, for example, inhibition of DNA
polymerase, may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of
cycles, many more than are necessary to detect a true
positive. If a sample fails to have a fluorescence increase
after this many cycles, inhibition may be suspected. Since,
in this assay, conclusions are drawn based on the presence
or absence of fluorescence signal alone, such controls may
be important. In any event, before any test based on this
principle is ready for the clinic, an assessment of its false
positive/false negative rates will need to be obtained using
a large number of known samples.

In summary, the inclusion in PCR of dyes whose fluorescence
is enhanced upon binding dsDNA makes it
possible to detect specific DNA amplification from outside
the PCR tube. In the future, instruments based upon this
principle may facilitate the more widespread use of PCR
in applications that demand the high throughput of
samples.

EXPERIMENTAL PROTOCOL 

Human HLA-DQα gene amplifications containing EtBr.
PCRs were set up in 100 µl volumes containing 10 mM Tris-HCl,
pH 8.8; 50 mM KCl; 4 mM MgCl2; S3, units of Taq DNA
polymerase (Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each
of human HLA-DQα gene specific oligonucleotide primers
GH26 and GH27 19 and approximately 10* copies of DQα PCR
product diluted from a previous reaction. Ethidium bromide
(EtBr; Sigma) was used at the concentrations indicated in Figure
2. Thermocycling proceeded for 20 cycles in a model 480
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a
"step-cycle" program of 94°C for 1 min denaturation and 60°C for 30
sec annealing and 72°C for 30 sec extension.

Y-chromosome specific PCR. PCRs (100 µl total reaction
volume) containing tU» µg/ml EtBr were prepared as described
for HLA-DQα, except with different primers and target DNAs.
These PCRs contained ? $ pmole each male DNA-specific primers
Y1.1 and Y1.2, and either 60 ng male, 60 ng female, 2 ng male,
or no human DNA. Thermocycling was 94°C for 1 min and 60°C
for 1 min using a "step-cycle" program. The number of cycles for
a sample were as indicated in Figure 3. Fluorescence
measurement is described below.

Allele-specific, human β-globin PCR. Amplifications of
100 µl volume using 0.5 µg/ml of EtBr were prepared as
described for HLA-DQα above except with different primers and
target DNAs. These PCRs contained either primer pair HGP2/
Hβ14A (wild-type globin specific primers) or HGP2/Hβ14S
(sickle-globin specific primers) at 10 pmole each primer per PCR.
These primers were developed by Wu et al. 21 . Three different
target DNAs were used in separate amplifications: 50 ng each of
human DNA that was homozygous for the sickle trait (SS), DNA
that was heterozygous for the sickle trait (AS), or DNA that was
homozygous for the wild-type globin (AA). Thermocycling was for SO
cycles at 94°C for 1 min and 55°C for 1 min using a "step-cycle"
program. An annealing temperature of 55°C had been shown by
Wu et al. 21 to provide allele-specific amplification. Completed
PCRs were photographed through a red filter (Wratten 23A)
after placing the reaction tubes atop a model TM-36
transilluminator (UV-products, San Gabriel, CA).

Fluorescence measurement. Fluorescence measurements were
made on PCRs containing EtBr in a Fluorolog-2 fluorometer
(SPEX, Edison, NJ). Excitation was at the 500 nm band with
about 2 nm bandwidth with a OG 435 nm cut-off filter (Melles
Griot, Inc., Irvine, CA) to exclude second-order light. Emitted
light was detected at 570 nm with a bandwidth of about 7 nm. An
OG 530 nm cut-off filter was used to remove the excitation light.

Continuous fluorescence monitoring of PCR. Continuous
monitoring of a PCR in progress was accomplished using the
spectrofluorometer and settings described above as well as a
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation
light to, and receive emitted light from, a PCR placed in a well of
a model 480 thermocycler (Perkin-Elmer Cetus). The probe end
of the fiberoptic cable was attached with "5 minute-epoxy" to the




from room light and the room lights were kept dimmed during
each run. The monitored PCR was an amplification of
Y-chromosome-specific repeat sequences as described above, except
using an annealing/extension temperature of 50°C. The reaction
was covered with mineral oil (2 drops) to prevent evaporation.
Thermocycling and fluorescence measurement were started
simultaneously. A time-base scan with a 10 second integration time
was used and the emission signal was ratioed to the excitation
signal to control for changes in light-source intensity. Data were
collected using the dm3000f, version 15 (SPEX) data system.

We thank Bob Jones for help with the spectrofluorometric
measurements and Heatherbell Fong for editing this manuscript.
Oligonucleotides with Fluorescent Dyes at
Opposite Ends Provide a Quenched Probe 
System Useful for Detecting PCR Product 
and Nucleic Acid Hybridization 

Kenneth J. Livak, Susan JA Flood, Jeffrey Marmaro, William Giusti, and Karin Deetz 

Perkin-Elmer, Applied Biosystems Division, Foster City, California 94404 



The 5' nuclease PCR assay detects the
accumulation of specific PCR product
by hybridization and cleavage of a
double-labeled fluorogenic probe
during the amplification reaction.
The probe is an oligonucleotide with
both a reporter fluorescent dye and a
quencher dye attached. An increase
in reporter fluorescence intensity
indicates that the probe has hybridized
to the target PCR product and has
been cleaved by the 5'→3' nucleolytic
activity of Taq DNA polymerase.
In this study, probes with the
quencher dye attached to an internal
nucleotide were compared with
probes with the quencher dye
attached to the 3'-end nucleotide. In all
cases, the reporter dye was attached
to the 5' end. All intact probes
showed quenching of the reporter
fluorescence. In general, probes with
the quencher dye attached to the
3'-end nucleotide exhibited a larger signal
in the 5' nuclease PCR assay than
the internally labeled probes. It is
proposed that the larger signal is
caused by increased likelihood of
cleavage by Taq DNA polymerase
when the probe is hybridized to a
template strand during PCR. Probes
with the quencher dye attached to
the 3'-end nucleotide also exhibited
an increase in reporter fluorescence
intensity when hybridized to a
complementary strand. Thus,
oligonucleotides with reporter and quencher
dyes attached at opposite ends can
be used as homogeneous hybridization
probes.



A homogeneous assay for detecting
the accumulation of specific PCR product
that uses a double-labeled fluorogenic
probe was described by Lee et al.(1)
The assay exploits the 5'→3' nucleolytic
activity of Taq DNA polymerase(2,3)
and is diagramed in Figure 1.
The fluorogenic probe consists of an
oligonucleotide with a reporter fluorescent
dye, such as a fluorescein, attached to
the 5' end; and a quencher dye, such as a
rhodamine, attached internally. When
the fluorescein is excited by irradiation,
its fluorescent emission will be
quenched if the rhodamine is close
enough to be excited through the process
of fluorescence energy transfer
(FET).(4,5) During PCR, if the probe is
hybridized to a template strand, Taq DNA
polymerase will cleave the probe
because of its inherent 5'→3' nucleolytic
activity. If the cleavage occurs between
the fluorescein and rhodamine dyes, it
causes an increase in fluorescein
fluorescence intensity because the fluorescein
is no longer quenched. The increase in
fluorescein fluorescence intensity
indicates that the probe-specific PCR product
has been generated. Thus, FET between a
reporter dye and a quencher dye is
critical to the performance of the probe in
the 5' nuclease PCR assay.

Quenching is completely dependent 
on the physical proximity of the two 
dyes. (6) Because of this, it has been as- 
sumed that the quencher dye must be 
attached near the 5' end. Surprisingly, 
we have found that attaching a rho- 
damine dye at the 3' end of a probe 
still provides adequate quenching for 
the probe to perform in the 5' nuclease 



PCR assay. Furthermore, cleavage of this 
type of probe is not required to achieve 
some reduction in quenching. Oligonu- 
cleotides with a reporter dye on the 5' 
end and a quencher dye on the 3' end
exhibit a much higher reporter fluores- 
cence when double-stranded as com- 
pared with single-stranded. This should 
make it possible to use this type of dou- 
ble-labeled probe for homogeneous de- 
tection of nucleic acid hybridization. 

MATERIALS AND METHODS 
Oligonucleotides 

Table 1 shows the nucleotide sequence 
of the oligonucleotides used in this 
study. Linker arm nucleotide (LAN) 
phosphoramidite was obtained from 
Glen Research. The standard DNA phos- 
phoramidites, 6-carboxyfluorescein (6- 
FAM) phosphoramidite, 6-carboxytet- 
ramethylrhodamine succinimidyl ester 
(TAMRA NHS ester), and Phosphalink 
for attaching a 3'-blocking phosphate,
were obtained from Perkin-Elmer, Ap- 
plied Biosystems Division. Oligonucle- 
otide synthesis was performed using an 
ABI model 394 DNA synthesiser (Applied 
Biosystems). Primer and complement 
oligonucleotides were purified using 
Oligo Purification Cartridges (Applied 
Biosystems). Double-labeled probes were 
synthesized with 6-FAM-labeled phosphoramidite
at the 5' end, LAN replacing
one of the T's in the sequence, and Phosphalink
at the 3' end. Following deprotection
and ethanol precipitation,
TAMRA NHS ester was coupled to the
LAN-containing oligonucleotide in 250
mM Na-bicarbonate buffer (pH 9.0) at
room temperature. Unreacted dye was
removed by passage over a PD-10 Sephadex
column. Finally, the double-labeled
probe was purified by preparative
high-performance liquid chromatography
(HPLC) using an Aquapore C8 220 × 4.6-mm
column with 7-µm particle size. The
column was developed with a 24-min
linear gradient of 8-20% acetonitrile in
0.1 M TEAA (triethylamine acetate).
Probes are named by designating the
sequence from Table 1 and the position of
the LAN-TAMRA moiety. For example,
probe A1-7 has sequence A1 with
LAN-TAMRA at nucleotide position 7 from the
5' end.

FIGURE 1 Diagram of 5' nuclease assay. Stepwise representation of the 5'→3' nucleolytic
activity of Taq DNA polymerase acting on a fluorogenic probe during one extension phase of PCR.
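
The probe naming convention can be made concrete with a short Python sketch. It uses the shorthand that appears in the probe list of Figure 2 (R for the 5' reporter, Q for the LAN-TAMRA position, p for the 3' phosphate) and the A1 sequence as it appears there. The function name `labeled_probe` is a hypothetical helper introduced only for illustration.

```python
# A1 probe sequence, 5' -> 3', as listed for the A1 series of probes.
A1 = "ATGCCCTCCCCCATGCCATCCTGCGT"


def labeled_probe(sequence, tamra_position):
    """Return the R...Q...p shorthand for a double-labeled probe.

    tamra_position is 1-based and counted from the 5' end. LAN-TAMRA
    replaces a T, so the base at that position must be T.
    """
    index = tamra_position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[index] != "T":
        raise ValueError("LAN-TAMRA can only replace a T")
    return "R" + sequence[:index] + "Q" + sequence[index + 1:] + "p"


print("A1-7 ", labeled_probe(A1, 7))
print("A1-26", labeled_probe(A1, 26))
```

Because LAN substitutes for a thymidine, only T positions are valid attachment sites, which is why the A1 series uses positions 2, 7, 14, 19, 22, and 26.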



PCR Systems 

All PCR amplifications were performed
in the Perkin-Elmer GeneAmp PCR System
9600 using 50-µl reactions that contained
10 mM Tris-HCl (pH 8.3), 50 mM
KCl, 200 µM dATP, 200 µM dCTP, 200 µM
dGTP, 400 µM dUTP, 0.5 unit of AmpErase
uracil N-glycosylase (Perkin-Elmer),
and 1.25 unit of AmpliTaq DNA polymerase
(Perkin-Elmer). A 295-bp segment
from exon 3 of the human β-actin
gene (nucleotides 2141-2435 in the sequence
of Nakajima-Iijima et al.)(7) was
amplified using primers AFP and ARP
(Table 1), which are modified slightly
from those of du Breuil et al.(8) Actin
amplification reactions contained 4 mM
MgCl2, 20 ng of human genomic DNA,
50 nM A1 or A3 probe, and 300 nM each
primer. The thermal regimen was 50°C
(2 min), 95°C (10 min), 40 cycles of 95°C
(20 sec), 60°C (1 min), and hold at 72°C.
A 515-bp segment was amplified from a
plasmid that consists of a segment of λ
DNA (nucleotides 32,220-32,747) inserted
in the SmaI site of vector pUC119.
These reactions contained 3.5 mM
MgCl2, 1 ng of plasmid DNA, 50 nM P2 or
P5 probe, 200 nM primer F119, and 200
nM primer R119. The thermal regimen
was 50°C (2 min), 95°C (10 min), 25 cycles
of 95°C (20 sec), 57°C (1 min), and
hold at 72°C.

TABLE 1 Sequences of Oligonucleotides

Name   Type         Sequence
F119   primer       ACCCACAGGAACTGATCACCACTC
R119   primer       ATGTCGCGTTCCGGCTGACGTTCTGC
P2     probe        TCGCATTACTGATCGTTGCCAACCAGTp
P2C    complement   GTACTGGTTGGCAACGATCAGTAATGCGATG
P5     probe        CGGATTTGCTGGTATCTATGACAAGGATp
P5C    complement   TTCATCCTTGTCATAGATACCAGCAAATCCG
AFP    primer       TGACCCACACTGTGCCCATCTACGA
ARP    primer       CAGCGGAACCGCTCATTGCCAATGG
A1     probe        ATGCCCTCCCCCATGCCATCCTGCGTp
A1C    complement   AGACGCAGGATGGCATGGGGGAGGGCATAC
A3     probe        CGCCCTGGACTTCGAGCAAGAGATp
A3C    complement   CCATCTCTTGCTCGAAGTCCAGGGCGAC

For each oligonucleotide used in this study, the nucleic acid sequence is given, written in the
5'→3' direction. There are three types of oligonucleotides: PCR primer, fluorogenic probe used
in the 5' nuclease assay, and complement used to hybridize to the corresponding probe. For the
probes, the underlined base indicates a position where LAN with TAMRA attached was substituted
for a T. (p) The presence of a 3' phosphate on each probe.

Probe   Sequence
A1-2    RAQGCCCTCCCCCATGCCATCCTGCGTp
A1-7    RATGCCCQCCCCCATGCCATCCTGCGTp
A1-14   RATGCCCTCCCCCAQGCCATCCTGCGTp
A1-19   RATGCCCTCCCCCATGCCAQCCTGCGTp
A1-22   RATGCCCTCCCCCATGCCATCCQGCGTp
A1-26   RATGCCCTCCCCCATGCCATCCTGCGQp

        518 nm                       582 nm
Probe   no temp.      + temp.        no temp.      + temp.       RQ−           RQ+           ΔRQ
A1-2    25.5 ± 2.1    32.7 ± 1.9     38.2 ± 3.0    38.2 ± 2.0    0.67 ± 0.01   0.86 ± 0.06   0.19 ± 0.06
A1-7    53.5 ± 4.3    395.1 ± 21.4   106.5 ± 6.3   110.3 ± 5.3   0.49 ± 0.03   3.58 ± 0.17   3.09 ± 0.18
A1-14   127.0 ± 4.9   403.5 ± 19.1   108.7 ± 5.3   93.1 ± 6.3    1.16 ± 0.02   4.34 ± 0.15   3.18 ± 0.15
A1-19   187.5 ± 17.9  422.7 ± 7.7    70.3 ± 7.4    73.0 ± 2.8    2.67 ± 0.05   5.80 ± 0.15   3.13 ± 0.16
A1-22   224.6 ± 9.4   462.2 ± 43.6   100.0 ± 4.0   96.2 ± 9.6    2.25 ± 0.03   5.02 ± 0.11   2.77 ± 0.12
A1-26   160.2 ± 8.9   454.1 ± 18.4   93.1 ± 5.4    90.7 ± 3.2    1.72 ± 0.02   5.01 ± 0.08   3.29 ± 0.08

FIGURE 2 Results of 5' nuclease assay comparing β-actin probes with TAMRA at different nucleotide
positions. As described in Materials and Methods, PCR amplifications containing the indicated
probes were performed and the fluorescence emission was measured at 518 and 582 nm.
Reported values are the average ±1 S.D. for six reactions run without added template (no temp.)
and six reactions run with template (+ temp.). The RQ ratio was calculated for each individual
reaction and averaged to give the reported RQ− and RQ+ values.

Fluorescence Detection

For each amplification reaction, a 40-µl
aliquot of a sample was transferred to an
individual well of a white, 96-well microtiter
plate (Perkin-Elmer). Fluorescence
was measured on the Perkin-Elmer TaqMan
LS-50B System, which consists of a
luminescence spectrometer with plate
reader assembly, a 485-nm excitation filter,
and a 515-nm emission filter. Excitation
was at 488 nm using a 5-nm slit
width. Emission was measured at 518
nm for 6-FAM (the reporter or R value)
and 582 nm for TAMRA (the quencher or
Q value) using a 10-nm slit width. To
determine the increase in reporter emission
that is caused by cleavage of the
probe during PCR, three normalizations
are applied to the raw emission data.
First, emission intensity of a buffer blank
is subtracted for each wavelength. Second,
emission intensity of the reporter is
divided by the emission intensity of the
quencher to give an RQ ratio for each
reaction tube. This normalizes for well-to-well
variations in probe concentration
and fluorescence measurement. Finally,
ΔRQ is calculated by subtracting
the RQ value of the no-template control
(RQ−) from the RQ value for the complete
reaction including template
(RQ+).
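
The three normalizations described under Fluorescence Detection (blank subtraction, the reporter/quencher ratio, and subtraction of the no-template RQ) can be sketched in Python as follows. The function names and the well readings are hypothetical and serve only to illustrate the arithmetic; per-reaction RQ values are averaged before subtraction, as stated in the legend to Figure 2.

```python
from statistics import mean


def rq_ratio(reporter, quencher, blank_reporter=0.0, blank_quencher=0.0):
    """Blank-subtracted 518-nm emission divided by blank-subtracted 582-nm emission."""
    return (reporter - blank_reporter) / (quencher - blank_quencher)


def delta_rq(no_template_wells, template_wells, blank=(0.0, 0.0)):
    """Return (RQ-, RQ+, delta RQ) from lists of (518 nm, 582 nm) readings.

    RQ is computed for each well and then averaged within each group.
    """
    rq_minus = mean(rq_ratio(r, q, *blank) for r, q in no_template_wells)
    rq_plus = mean(rq_ratio(r, q, *blank) for r, q in template_wells)
    return rq_minus, rq_plus, rq_plus - rq_minus


# Hypothetical readings (not data from this study).
no_template = [(50.0, 100.0), (55.0, 110.0)]
with_template = [(400.0, 100.0), (330.0, 110.0)]
print(delta_rq(no_template, with_template))
```

Dividing by the quencher signal is what makes the result insensitive to well-to-well differences in probe amount: in the hypothetical data, the second well of each pair holds 10% more probe, yet its no-template RQ is unchanged.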
RESULTS 

A series of probes with increasing distances
between the fluorescein reporter
and rhodamine quencher were tested to
investigate the minimum and maximum
spacing that would give an acceptable
performance in the 5' nuclease PCR assay.
These probes hybridize to a target
sequence in the human β-actin gene.
Figure 2 shows the results of an experiment
in which these probes were included
in PCR that amplified a segment
of the β-actin gene containing the target
sequence. Performance in the 5' nuclease
PCR assay is monitored by the
magnitude of ΔRQ, which is a measure
of the increase in reporter fluorescence
caused by PCR amplification of the
probe target. Probe A1-2 has a ΔRQ value
that is close to zero, indicating that the
probe was not cleaved appreciably during
the amplification reaction. This suggests
that with the quencher dye on the
second nucleotide from the 5' end, there
is insufficient room for Taq polymerase
to cleave efficiently between the reporter
and quencher. The other five probes
exhibited comparable ΔRQ values that are
clearly different from zero. Thus, all five
probes are being cleaved during PCR
amplification resulting in a similar increase
in reporter fluorescence. It should be
noted that complete digestion of a probe
produces a much larger increase in reporter
fluorescence than that observed
in Figure 2 (data not shown). Thus, even
in reactions where amplification occurs,
the majority of probe molecules remain
uncleaved. It is mainly for this reason
that the fluorescence intensity of the
quencher dye TAMRA changes little with
amplification of the target. This is what
allows us to use the 582-nm fluorescence
reading as a normalization factor.

The magnitude of RQ− depends
mainly on the quenching efficiency inherent
in the specific structure of the
probe and the purity of the oligonucleotide.
Thus, the larger RQ− values indicate
that probes A1-14, A1-19, A1-22, and
A1-26 probably have reduced quenching
as compared with A1-7. Still, the degree
of quenching is sufficient to detect a
highly significant increase in reporter
fluorescence when each of these probes
is cleaved during PCR.

To further investigate the ability of
TAMRA on the 3' end to quench 6-FAM
on the 5' end, three additional pairs of
probes were tested in the 5' nuclease
PCR assay. For each pair, one probe has
TAMRA attached to an internal nucleotide
and the other has TAMRA attached
to the 3' end nucleotide. The results are
shown in Table 2. For all three sets, the
probe with the 3' quencher exhibits a
ΔRQ value that is considerably higher
than for the probe with the internal
quencher. The RQ− values suggest that
differences in quenching are not as great
as those observed with some of the A1
probes. These results demonstrate that a
quencher dye on the 3' end of an oligonucleotide
can quench efficiently the
fluorescence of a reporter dye on the 5'
end. The degree of quenching is sufficient
for this type of oligonucleotide to
be used as a probe in the 5' nuclease PCR
assay.

TABLE 2 Results of 5' Nuclease Assay Comparing Probes with TAMRA Attached to an Internal or 3'-terminal Nucleotide

        518 nm                       582 nm
Probe   no temp.      + temp.        no temp.      + temp.        RQ−           RQ+           ΔRQ
A3-6    54.6 ± 3.2    84.8 ± 3.7     116.2 ± 6.4   115.6 ± 2.5    0.47 ± 0.02   0.73 ± 0.03   0.26 ± 0.04
A3-24   72.1 ± 2.9    236.5 ± 11.1   84.2 ± 4.0    90.2 ± 3.8     0.86 ± 0.02   2.62 ± 0.05   1.76 ± 0.05
P2-7    82.8 ± 4.4    384.0 ± 34.1   105.1 ± 6.4   120.4 ± 10.2   0.79 ± 0.02   3.19 ± 0.16   2.40 ± 0.16
P2-27   113.4 ± 6.6   555.4 ± 14.1   140.7 ± 8.5   118.7 ± 4.8    0.81 ± 0.01   4.68 ± 0.10   3.88 ± 0.10
P5-10   77.5 ± 6.5    244.4 ± 15.9   86.7 ± 4.3    95.8 ± 6.7     0.89 ± 0.05   2.55 ± 0.06   1.66 ± 0.08
P5-28   64.0 ± 5.2    333.6 ± 12.1   100.6 ± 6.1   94.7 ± 6.3     0.63 ± 0.02   3.53 ± 0.12   2.89 ± 0.13

Reactions containing the indicated probes and calculations were performed as described in Materials and Methods and in the legend to Fig. 2.

To test the hypothesis that quenching 
by a 3' TAMRA depends on the flexibility 
of the oligonucleotide, fluorescence was 
measured for probes in the single- 
stranded and double-stranded states. Ta- 
ble 3 reports the fluorescence observed 
at 518 and 582 nm. The relative degree
of quenching is assessed by calculating
the RQ ratio. For probes with TAMRA
6-10 nucleotides from the 5' end, there
is little difference in the RQ values when
comparing single-stranded with double- 
stranded oligonucleotides. The results 
for probes with TAMRA at the 3' end are 
much different. For these probes, hy- 
bridization to a complementary strand 
causes a dramatic increase in RQ. We 
propose that this loss of quenching is 
caused by the rigid structure of double- 
stranded DNA, which prevents the 5' 
and 3' ends from being in proximity. 
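
The contrast between internally labeled and 3'-labeled probes can be quantified as the ratio of double-stranded to single-stranded RQ. The Python sketch below uses the RQ values reported in Table 3; the fold-change metric and the variable names are illustrative choices, not a calculation performed in the study.

```python
# (RQ single-stranded, RQ double-stranded) as reported in Table 3.
TABLE3_RQ = {
    "A1-7": (0.45, 0.50), "A1-26": (0.81, 5.43),
    "A3-6": (0.43, 0.38), "A3-24": (0.45, 3.21),
    "P2-7": (0.64, 0.58), "P2-27": (0.61, 5.25),
    "P5-10": (0.44, 0.87), "P5-28": (0.46, 4.43),
}
# Probes with TAMRA on the 3'-terminal nucleotide.
THREE_PRIME_PROBES = {"A1-26", "A3-24", "P2-27", "P5-28"}


def rq_fold_change(rq_ss, rq_ds):
    """Ratio of double-stranded RQ to single-stranded RQ."""
    return rq_ds / rq_ss


fold_changes = {name: rq_fold_change(ss, ds) for name, (ss, ds) in TABLE3_RQ.items()}
for name, fold in fold_changes.items():
    label = "3' TAMRA" if name in THREE_PRIME_PROBES else "internal TAMRA"
    print(f"{name:6s} {label:15s} RQ(ds)/RQ(ss) = {fold:.2f}")
```

With these values, every 3'-labeled probe shows a larger fold change on hybridization than any internally labeled probe, which is the pattern the rigid-duplex explanation predicts.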

When TAMRA is placed toward the 3'
end, there is a marked Mg2+ effect on
quenching. Figure 3 shows a plot of observed
RQ values for the A1 series of
probes as a function of Mg2+ concentration.
With TAMRA attached near the 5'
end (probe A1-2 or A1-7), the RQ value at
0 mM Mg2+ is only slightly higher than
RQ at 10 mM Mg2+. For probes A1-19,
A1-22, and A1-26, the RQ values at 0 mM
Mg2+ are very high, indicating a much
reduced quenching efficiency. For each
of these probes, there is a marked decrease
in RQ at 1 mM Mg2+ followed by
a gradual decline as the Mg2+ concentration
increases to 10 mM. Probe A1-14
shows an intermediate RQ value at 0 mM
Mg2+ with a gradual decline at higher
Mg2+ concentrations. In a low-salt environment
with no Mg2+ present, a single-stranded
oligonucleotide would be
expected to adopt an extended conformation
because of electrostatic repulsion.
The binding of Mg2+ ions acts to
shield the negative charge of the phosphate
backbone so that the oligonucleotide
can adopt conformations where
the 3' end is close to the 5' end. Therefore,
the observed Mg2+ effects support
the notion that quenching of a 5' reporter
dye by TAMRA at or near the 3'
end depends on the flexibility of the
oligonucleotide.

DISCUSSION 

The striking finding of this study is that
it seems the rhodamine dye TAMRA,
placed at any position in an oligonucleotide,
can quench the fluorescent emission
of a fluorescein (6-FAM) placed at
the 5' end. This implies that a single-stranded,
double-labeled oligonucleotide
must be able to adopt conformations
where the TAMRA is close to the 5'
end. It should be noted that the decay of
6-FAM in the excited state requires a certain
amount of time. Therefore, what
matters for quenching is not the average
distance between 6-FAM and TAMRA
but, rather, how close TAMRA can get to
6-FAM during the lifetime of the 6-FAM
excited state. As long as the decay time of
the excited state is relatively long compared
with the molecular motions of the
oligonucleotide, quenching can occur.
Thus, we propose that TAMRA at the 3'
end, or any other position, can quench
6-FAM at the 5' end because TAMRA is in
proximity to 6-FAM often enough to be
able to accept energy transfer from an
excited 6-FAM.

TABLE 3 Comparison of Fluorescence Emissions of Single-stranded and
Double-stranded Fluorogenic Probes

        518 nm             582 nm             RQ
Probe   ss       ds        ss       ds        ss      ds
A1-7    27.75    68.53     61.08    138.18    0.45    0.50
A1-26   43.31    509.38    53.50    93.86     0.81    5.43
A3-6    16.75    62.88     39.33    165.57    0.43    0.38
A3-24   30.05    578.64    67.72    140.25    0.45    3.21
P2-7    35.02    70.13     54.63    121.09    0.64    0.58
P2-27   39.89    320.47    65.10    61.13     0.61    5.25
P5-10   27.34    144.85    61.95    165.54    0.44    0.87
P5-28   33.65    462.29    72.39    104.61    0.46    4.43

(ss) Single-stranded. The fluorescence emissions at 518 or 582 nm for solutions containing a final
concentration of 50 nM indicated probe, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 10 mM MgCl2.
(ds) Double-stranded. The solutions contained, in addition, 100 nM A1C for probes A1-7 and
A1-26, 100 nM A3C for probes A3-6 and A3-24, 100 nM P2C for probes P2-7 and P2-27, or 100 nM
P5C for probes P5-10 and P5-28. Before the addition of MgCl2, 120 µl of each sample was heated
at 95°C for 5 min. Following the addition of 80 µl of 25 mM MgCl2, each sample was allowed to
cool to room temperature and the fluorescence emissions were measured. Reported values are
the average of three determinations.

Details of the fluorescence measure- 
ments remain puzzling. For example, Ta- 
ble 3 shows that hybridization of probes 
A1-26, A3-24, and P5-28 to their complementary
strands not only causes a large
increase in 6-FAM fluorescence at 518 
nm but also causes a modest increase in 
TAMRA fluorescence at 582 nm. If 
TAMRA is being excited by energy trans- 
fer from quenched 6-FAM, then loss of 
quenching attributable to hybridization 
should cause a decrease in the fluores- 
cence emission of TAMRA. The fact that 
the fluorescence emission of TAMRA in- 
creases indicates that the situation is 
more complex. For example, we have an- 
ecdotal evidence that the bases of the 
oligonucleotide, especially G, quench 
the fluorescence of both 6-FAM and 
TAMRA to some degree. When double- 
stranded, base-pairing may reduce the 
ability of the bases to quench. The pri- 
mary factor causing the quenching of 
6-FAM in an intact probe is the TAMRA 
dye. Evidence for the importance of 
TAMRA is that 6-FAM fluorescence 
remains relatively unchanged when 
probes labeled only with 6-FAM are used 
in the 5' nuclease PCR assay (data not 
shown). Secondary effectors of fluores- 
cence, both before and after cleavage of 
the probe, need to be explored further. 

Regardless of the physical mecha- 
nism, the relative independence of posi- 
tion and quenching greatly simplifies 
the design of probes for the 5' nuclease 
PCR assay. There are three main factors 
that determine the performance of a 
double-labeled fluorescent probe in the 
5' nuclease PCR assay. The first factor is 
the degree of quenching observed in the 
intact probe. This is characterized by the 
value of RQ−, which is the ratio of reporter
to quencher fluorescent emissions
for a no-template control PCR.
Influences on the value of RQ− include
the particular reporter and quencher
dyes used, spacing between reporter and
quencher dyes, nucleotide sequence
context effects, presence of structure or
other factors that reduce flexibility of
the oligonucleotide, and purity of the
probe. The second factor is the efficiency
of hybridization, which depends on
probe Tm, presence of secondary structure
in probe or template, annealing
temperature, and other reaction conditions.
The third factor is the efficiency at
which Taq DNA polymerase cleaves the
bound probe between the reporter and
quencher dyes. This cleavage is dependent
on sequence complementarity between
probe and template as shown by
the observation that mismatches in the
segment between reporter and quencher
dyes drastically reduce the cleavage of
probe.(1)

FIGURE 3 Effect of Mg2+ concentration on RQ ratio for the A1 series of probes. The fluorescence
emission intensity at 518 and 582 nm was measured for solutions containing 50 nM probe, 10 mM
Tris-HCl (pH 8.3), 50 mM KCl, and varying amounts (0-10 mM) of MgCl2. The calculated RQ
ratios (518 nm intensity divided by 582 nm intensity) are plotted vs. MgCl2 concentration (mM
Mg). The key (upper right) shows the probes examined.

The rise in RQ− values for the A1 series
of probes seems to indicate that the
degree of quenching is reduced somewhat
as the quencher is placed toward
the 3' end. The lowest apparent quenching
is observed for probe A1-19 (see Fig.
3) rather than for the probe where the
TAMRA is at the 3' end (A1-26). This is
understandable, as the conformation of
the 3' end position would be expected to
be less restricted than the conformation
of an internal position. In effect, a
quencher at the 3' end is freer to adopt
conformations close to the 5' reporter
dye than is an internally placed
quencher. For the other three sets of
probes, the interpretation of RQ− values
is less clear-cut. The A3 probes show the
same trend as A1, with the 3' TAMRA
probe having a larger RQ− than the internal
TAMRA probe. For the P2 pair,
both probes have about the same RQ−
value. For the P5 probes, the RQ− for the
3' probe is less than for the internally
labeled probe. Another factor that may
explain some of the observed variation is
that purity affects the RQ− value. Although
all probes are HPLC purified, a
small amount of contamination with
unquenched reporter can have a large effect
on RQ−.

Although there may be a modest ef- 
fect on degree of quenching, the posi- 
tion of the quencher apparently can 
have a large effect on the efficiency of 
probe cleavage. The most drastic effect is 
observed with probe Al-2, where place- 
ment of the TAMRA on the second nu- 
cleotide reduces the efficiency of cleav- 
age to almost zero. For the A3, P2, and P5 
probes, ARQ is much greater for the 3' 
TAMRA probes as compared with the in- 
ternal TAMRA probes. This is explained 
most easily by assuming that probes 
with TAMRA at the 3' end are more likely 
to be cleaved between reporter and 
quencher than are probes with TAMRA 
attached internally. For the Al probes, 
the cleavage efficiency of probe Al-7 
must already be quite high, as ARQ does 
not increase when the quencher is i 
placed closer to the 3' end. This illus- I 



trates the importance of being able to 
use probes with a quencher on the 3' 
end in the 5' nuclease PCR assay. In this 
assay, an increase in the intensity of re- 
porter fluorescence is observed only 
when the probe is cleaved between the 
reporter and quencher dyes. By placing 
the. reporter and quencher dyes on the 
opposite ends of an oligonucleotide 
probe; any cleavage that occurs will be 
detected. When the quencher is attached 
to an internal nucleotide, sometimes the 
probe works well (Al-7) and other times 
not so well (A3-6). The relatively poor 
performance of probe A3-6 presumably 
means the probe is being cleaved 3' to 
the quencher rather than between the 
reporter and quencher. Therefore, the 
best chance of having a probe that reli- 
ably detects accumulation of PCR prod- 
uct in the 5' nuclease PCR assay is to use 
a probe with the reporter and quencher 
dyes on opposite ends. 
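
The detection rule stated above (reporter fluorescence rises only when the cut falls between reporter and quencher) can be illustrated with a minimal sketch. The function names, the 26-nucleotide probe length, and the equal-weight enumeration of cut sites are illustrative assumptions, not taken from the study; real cleavage sites are not uniformly distributed along a probe.

```python
def cleavage_detected(cut_after: int, quencher_pos: int, reporter_pos: int = 1) -> bool:
    """Return True if a cut between nucleotides cut_after and cut_after + 1
    separates the reporter (assumed on nucleotide 1, the 5' end) from the quencher."""
    return reporter_pos <= cut_after < quencher_pos


def detectable_cut_sites(probe_length: int, quencher_pos: int) -> list[int]:
    """List every internal cut site (after nucleotide 1 .. probe_length - 1)
    that would release the reporter from the quencher."""
    return [k for k in range(1, probe_length)
            if cleavage_detected(k, quencher_pos)]


# Hypothetical 26-nucleotide probe: quencher on nucleotide 7 versus on the 3' end.
internal = detectable_cut_sites(26, quencher_pos=7)
three_prime = detectable_cut_sites(26, quencher_pos=26)
print(len(internal), len(three_prime))
```

With the quencher on the 3' end every internal cut site separates the two dyes, which is the point the paragraph makes; with an internal quencher, any cut 3' to the quencher goes undetected.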

Placing the quencher dye on the 3' 
end may also provide a slight benefit in 
terms of hybridization efficiency. The 
presence of a quencher attached to an 
internal nucleotide might be expected to 
disrupt base-pairing and reduce the Tm
of a probe. In fact, a 2°C-3°C reduction
in Tm has been observed for two probes
with internally attached TAMRAs. (9) This
disruptive effect would be minimized by 
placing the quencher at the 3' end. Thus, 
probes with 3' quenchers might exhibit 
slightly higher hybridization efficiencies 
than probes with internal quenchers. 

The combination of increased cleav-
age and hybridization efficiencies means
that probes with 3' quenchers probably
will be more tolerant of mismatches be-
tween probe and target as compared
with internally labeled probes. This tol-
erance of mismatches can be advanta-
geous, as when trying to use a single
probe to detect PCR-amplified products 
from samples of different species. Also, it 
means that cleavage of probe during PCR 
is less sensitive to alterations in an- 
nealing temperature or other reaction 
conditions. The one application where 
tolerance of mismatches may be a disad-
vantage is for allelic discrimination. Lee
et al. (1) demonstrated that allele-specific
probes were cleaved between reporter
and quencher only when hybridized to a
perfectly complementary target. This al-
lowed them to distinguish the normal
human cystic fibrosis allele from the
ΔF508 mutant. Their probes had TAMRA
attached to the seventh nucleotide from
the 5' end and were designed so that any
mismatches were between the reporter
and quencher. Increasing the distance
between reporter and quencher would
lessen the disruptive effect of mis- 
matches and allow cleavage of the probe 
on the incorrect target. Thus, probes 
with a quencher attached to an internal 
nucleotide may still be useful for allelic 
discrimination. 

In this study loss of quenching upon 
hybridization was used to show that 
quenching by a 3' TAMRA is dependent 
on the flexibility of a single-stranded oli- 
gonucleotide. The increase in reporter 
fluorescence intensity, though, could 
also be used to determine whether hy-
bridization has occurred or not. Thus,
oligonucleotides with reporter and 
quencher dyes attached at opposite ends 
should also be useful as hybridization 
probes. The ability to detect hybridiza- 
tion in real time means that these probes 
could be used to measure hybridization 
kinetics. Also, this type of probe could be 
used to develop homogeneous hybrid- 
ization assays for diagnostics or other ap-
plications. Bagwell et al. (10) describe just
this type of homogeneous assay where 
hybridization of a probe causes an in- 
crease in fluorescence caused by a loss of 
quenching. However, they utilized a 
complex probe design that requires add- 
ing nucleotides to both ends of the 
probe sequence to form two imperfect 
hairpins. The results presented here 
demonstrate that the simple addition of 
a reporter dye to one end of an oligonu- 
cleotide and a quencher dye to the other 
end generates a fluorogenic probe that 
can detect hybridization or PCR amplifi-
cation.

We have developed a novel "real time" quantitative PCR method. The method measures PCR product
accumulation through a dual-labeled fluorogenic probe (i.e., TaqMan Probe). This method provides very
accurate and reproducible quantitation of gene copies. Unlike other quantitative PCR methods, real-time PCR
does not require post-PCR sample handling, preventing potential PCR product carry-over contamination and
resulting in much faster and higher throughput assays. The real-time PCR method has a very large dynamic
range of starting target molecule determination (at least five orders of magnitude). Real-time quantitative
PCR is extremely accurate and less labor-intensive than current quantitative PCR methods.



Quantitative nucleic acid sequence analysis has had an important role in many
fields of biological research. Measurement of gene expression (RNA) has been
used extensively in monitoring biological responses to various stimuli (Tan et
al. 1994; Huang et al. 1995a,b; Prud'homme et al. 1995). Quantitative gene
analysis (DNA) has been used to determine the genome quantity of a particular
gene, as in the case of the human HER2 gene, which is amplified in ~30% of
breast tumors (Slamon et al. 1987). Gene and genome quantitation (DNA and RNA)
also have been used for analysis of human immunodeficiency virus (HIV) burden,
demonstrating changes in the levels of virus throughout the different phases of
the disease (Connor et al. 1993; Piatak et al. 1993b; Furtado et al. 1995).

Many methods have been described for the quantitative analysis of nucleic acid
sequences (both for RNA and DNA; Southern 1975; Sharp et al. 1980; Thomas 1980).
Recently, PCR has proven to be a powerful tool for quantitative nucleic acid
analysis. PCR and reverse transcriptase (RT)-PCR have permitted the analysis of
minimal starting quantities of nucleic acid (as little as one cell equivalent).
This has made possible many experiments that could not have been performed with
traditional methods. Although PCR has provided a powerful tool, it is imperative
that it be used properly for quantitation (Raeymaekers 1995). Many early reports
of quantitative PCR and RT-PCR described quantitation of the PCR product but did
not measure the initial target sequence quantity. It is essential to design
proper controls for the quantitation of the initial target sequences (Ferre
1992; Clementi et al. 1993).

Researchers have developed several methods of quantitative PCR and RT-PCR. One
approach measures PCR product quantity in the log phase of the reaction before
the plateau (Kellogg et al. 1990; Pang et al. 1990). This method requires that
each sample has equal input amounts of nucleic acid and that each sample under
analysis amplifies with identical efficiency up to the point of quantitative
analysis. A gene sequence (contained in all samples at relatively constant
quantities, such as β-actin) can be used for sample amplification efficiency
normalization. Using conventional methods of PCR detection and quantitation (gel
electrophoresis or plate capture hybridization), it is extremely laborious to
assure that all samples are analyzed during the log phase of the reaction (for
both the target gene and the normalization gene). Another method, quantitative
competitive (QC)-PCR, has been developed and is used widely for PCR
quantitation. QC-PCR relies on the inclusion of an internal control competitor
in each reaction (Becker-Andre 1991; Piatak et al. 1993a,b). The efficiency of
each reaction is normalized to the internal competitor. A known amount of
internal competitor can be added to each sample. To obtain relative
quantitation, the unknown target PCR product is compared with the known
competitor PCR product. Success of a quantitative competitive PCR assay relies
on developing an internal control that amplifies with the same efficiency as the
target molecule. The design of the competitor and the validation of
amplification efficiencies require a dedicated effort. However, because QC-PCR
does not require that PCR products be analyzed during the log phase of the
amplification, it is the easier of the two methods to use.

Several detection systems are used for quantitative PCR and RT-PCR analysis: (1)
agarose gels, (2) fluorescent labeling of PCR products and detection with
laser-induced fluorescence using capillary electrophoresis (Fasco et al. 1995;
Williams et al. 1996) or acrylamide gels, and (3) plate capture and sandwich
probe hybridization (Mulder et al. 1994). Although these methods proved
successful, each method requires post-PCR manipulations that add time to the
analysis and may lead to laboratory contamination. The sample throughput of
these methods is limited (with the exception of the plate capture approach),
and, therefore, these methods are not well suited for uses demanding high sample
throughput (i.e., screening of large numbers of biomolecules or analyzing
samples for diagnostics or clinical trials).

Here, we report the development of a novel assay for quantitative DNA analysis.
The assay is based on the use of the 5' nuclease assay first described by
Holland et al. (1993). The method uses the 5' nuclease activity of Taq
polymerase to cleave a nonextendible hybridization probe during the extension
phase of PCR. The approach uses dual-labeled fluorogenic hybridization probes
(Lee et al. 1993; Bassler et al. 1995; Livak et al. 1995a,b). One fluorescent
dye serves as a reporter [FAM (i.e., 6-carboxyfluorescein)] and its emission
spectrum is quenched by the second fluorescent dye, TAMRA (i.e.,
6-carboxy-tetramethylrhodamine). The nuclease degradation of the hybridization
probe releases the quenching of the FAM fluorescent emission, resulting in an
increase in peak fluorescent emission at 518 nm. The use of a sequence detector
(ABI Prism) allows measurement of fluorescent spectra of all 96 wells of the
thermal cycler continuously during the PCR amplification. Therefore, the
reactions are monitored in real time. The output data are described and
quantitative analysis of input target DNA sequences is discussed below.

RESULTS

PCR Product Detection in Real Time

Figure 1 PCR product detection in real time. (A) The Model 7700 software will construct amplification plots
from the extension phase fluorescent emission data collected during the PCR amplification. The standard de-
viation is determined from the data points collected from the base line of the amplification plot. CT values are
calculated by determining the point at which the fluorescence exceeds a threshold limit (usually 10 times the
standard deviation of the base line). (B) Overlay of amplification plots of serially (1:2) diluted human genomic
DNA samples amplified with β-actin primers. (C) Input DNA concentration of the samples plotted versus CT. All

The goal was to develop a high-throughput, sensitive, and accurate gene
quantitation assay for use in monitoring lipid-mediated therapeutic gene
delivery. A plasmid encoding human factor VIII gene sequence, pF8TM (see
Methods), was used as a model therapeutic gene. The assay uses fluorescent
Taqman methodology and an instrument capable of measuring fluorescence in real
time (ABI Prism 7700 Sequence Detector). The Taqman reaction requires a
hybridization probe labeled with two different fluorescent dyes. One dye is a
reporter dye (FAM), the other is a quenching dye (TAMRA). When the probe is
intact, fluorescent energy transfer occurs and the reporter dye fluorescent
emission is absorbed by the quenching dye (TAMRA). During the extension phase of
the PCR cycle, the fluorescent hybridization probe is cleaved by the 5'-3'
nucleolytic activity of the DNA polymerase. On cleavage of the probe, the
reporter dye emission is no longer transferred efficiently to the quenching dye,
resulting in an increase of the reporter dye fluorescent emission spectra. PCR
primers and probes were designed for the human factor VIII sequence and human
β-actin gene (as described in Methods). Optimization reactions were performed to
choose the appropriate probe and magnesium concentrations yielding the highest
intensity of reporter fluorescent signal without sacrificing specificity. The
instrument uses a charge-coupled device (i.e., CCD camera) for measuring the
fluorescent emission spectra from 500 to 650 nm. Each PCR tube was monitored
sequentially for 25 msec with continuous monitoring throughout the
amplification. Each tube was re-examined every 8.5 sec. Computer software was
designed to examine the fluorescent intensity of both the reporter dye (FAM) and
the quenching dye (TAMRA). The fluorescent intensity of the quenching dye,
TAMRA, changes very little over the course of the PCR amplification (data not
shown). Therefore, the intensity of TAMRA dye emission serves as an internal
standard with which to normalize the reporter dye (FAM) emission variations. The
software calculates a value termed ΔRn (or ΔRQ) using the following equation:
$\Delta R_n = (R_n^{+}) - (R_n^{-})$, where $R_n^{+}$ = emission intensity of
reporter/emission intensity of quencher at any given time in a reaction tube,
and $R_n^{-}$ = emission intensity of reporter/emission intensity of quencher
measured prior to PCR amplification in that same reaction tube. For the purpose
of quantitation, the last three data points (ΔRns) collected during the
extension step for each PCR cycle were analyzed. The nucleolytic degradation of
the hybridization probe occurs during the extension phase of PCR, and,
therefore, reporter fluorescent emission increases during this time. The three
data points were averaged for each PCR cycle and the mean value for each was
plotted in an "amplification plot" shown in Figure 1A. The ΔRn mean value is
plotted on the y-axis, and time, represented by cycle number, is plotted on the
x-axis. During the early cycles of the PCR amplification, the ΔRn value remains
at base line. When sufficient hybridization probe has been cleaved by the Taq
polymerase nuclease activity, the intensity of reporter fluorescent emission
increases. Most PCR amplifications reach a plateau phase of reporter fluorescent
emission if the reaction is carried out to high cycle numbers. The amplification
plot is examined early in the reaction, at a point that represents the log phase
of product accumulation. This is done by assigning an arbitrary threshold that
is based on the variability of the base-line data. In Figure 1A, the threshold
was set at 10 standard deviations above the mean of base line emission
calculated from cycles 1 to 15. Once the threshold is chosen, the point at which
the amplification plot crosses the threshold is defined as CT. CT is reported as
the cycle number at this point. As will be demonstrated, the CT value is
predictive of the quantity of input target.
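
The ΔRn and threshold-cycle calculations described in this section can be sketched as follows. This is a minimal illustration, not the Model 7700 software: the function names are hypothetical, and the linear interpolation between the two cycles that bracket the threshold is an assumption made here to obtain a fractional CT (the text says only that CT is reported as the cycle number at the crossing point).

```python
from statistics import mean, stdev


def delta_rn(reporter, quencher, reporter_pre, quencher_pre):
    """Delta Rn = Rn+ - Rn-, where each Rn is reporter emission divided by
    quencher emission; Rn- is measured before PCR amplification."""
    return reporter / quencher - reporter_pre / quencher_pre


def threshold_cycle(delta_rn_by_cycle, baseline_cycles=15, n_sd=10.0):
    """Return the (fractional) cycle at which the amplification plot crosses
    the threshold, or None if it never does.

    delta_rn_by_cycle[0] is cycle 1. The threshold is n_sd standard deviations
    above the mean of the first baseline_cycles points (cycles 1 to 15 and
    10 standard deviations in the Figure 1A example)."""
    baseline = delta_rn_by_cycle[:baseline_cycles]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for i in range(baseline_cycles, len(delta_rn_by_cycle)):
        prev, cur = delta_rn_by_cycle[i - 1], delta_rn_by_cycle[i]
        if prev <= threshold < cur:
            # prev belongs to cycle i, cur to cycle i + 1 (1-based cycles).
            # Interpolating linearly between them is an assumption.
            return i + (threshold - prev) / (cur - prev)
    return None


# Synthetic trace: 15 noisy base-line cycles followed by a doubling signal.
trace = [0.0, 0.01] * 7 + [0.0] + [0.02, 0.04, 0.08, 0.16, 0.32]
print(threshold_cycle(trace))
```

The synthetic trace is invented for illustration; with real data the input would be the per-cycle mean of the last three ΔRn readings of each extension step.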

CT Values Provide a Quantitative Measurement of
Input Target Sequences

Figure 1B shows amplification plots of 15 different PCR amplifications overlaid.
The amplifications were performed on a 1:2 serial dilution of human genomic DNA.
The amplified target was human β-actin. The amplification plots shift to the
right (to higher threshold cycles) as the input target quantity is reduced. This
is expected because reactions with fewer starting copies of the target molecule
require greater amplification to degrade enough probe to attain the threshold
fluorescence. An arbitrary threshold of 10 standard deviations above the base
line was used to determine the CT values. Figure 1C represents the CT values
plotted versus the sample dilution value. Each dilution was amplified in
triplicate PCR amplifications and plotted as mean values with error bars
representing one standard deviation. The CT values decrease linearly with
increasing target quantity. Thus, CT values can be used as a quantitative
measurement of the input target number. It should be noted that the
amplification plot for the 15.6-ng sample shown in Figure 1B does not reflect
the same fluorescent rate of increase exhibited by most of the other samples.
The 15.6-ng sample also achieves endpoint plateau at a lower fluorescent value
than would be expected based on the input DNA. This phenomenon has been observed
occasionally with other samples (data not shown) and may be attributable to late
cycle inhibition; this hypothesis is still under investigation. It is important
to note that the flattened slope and early plateau do not impact significantly
the calculated CT value, as demonstrated by the fit on the line shown in Figure
1C. All triplicate amplifications resulted in very similar CT values; the
standard deviation did not exceed 0.6 for any dilution. This experiment contains
a >100,000-fold range of input target molecules. Using CT values for
quantitation permits a much larger assay range than directly using total
fluorescent emission intensity for quantitation. The linear range of fluorescent
intensity measurement of the ABI Prism 7700 Sequence [...]ments over a very
large range of relative starting target quantities.
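
Because CT decreases linearly with the logarithm of the input quantity, a dilution series can serve as a standard curve for estimating an unknown. The sketch below fits that line by ordinary least squares and inverts it; the function names and the four-point example series are hypothetical and are not data from Figure 1C.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of CT = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the input quantity for a CT."""
    return 10 ** ((ct - intercept) / slope)


# Idealized 1:2 dilution series in which each dilution costs exactly one cycle.
quantities_ng = [100.0, 50.0, 25.0, 12.5]
cts = [18.0, 19.0, 20.0, 21.0]
slope, intercept = fit_standard_curve(quantities_ng, cts)
print(round(slope, 2), round(quantity_from_ct(19.5, slope, intercept), 1))
```

In this idealized series the slope equals -1/log10(2), about -3.32 cycles per tenfold change in input, which is the value expected when product exactly doubles every cycle.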

Sample Preparation Validation 

Several parameters influence the efficiency of PCR amplification: magnesium and
salt concentrations, reaction conditions (i.e., time and temperature), PCR
target size and composition, primer sequences, and sample purity. All of the
above factors are common to a single PCR assay, except sample to sample purity.
In an effort to validate the method of sample preparation for the factor VIII
assay, PCR amplification reproducibility and efficiency of 10 replicate sample
preparations were examined. After genomic DNA was prepared from the 10 replicate
samples, the DNA was quantitated by ultraviolet spectroscopy. Amplifications
were performed analyzing β-actin gene content in 100 and 25 ng of total genomic
DNA. Each PCR amplification was performed in triplicate. Comparison of CT values
for each triplicate sample shows minimal variation based on standard deviation
and coefficient of variation (Table 1). Therefore, each of the triplicate PCR
amplifications was highly reproducible, demonstrating that real time PCR using
this instrumentation introduces minimal variation into the quantitative PCR
analysis. Comparison of the mean CT values of the 10 replicate sample
preparations also showed minimal variability, indicating that each sample
preparation yielded similar results for β-actin gene quantity. The highest CT
difference between any of the samples was 0.[...] and 0.71 for the 100 and 25 ng
samples, respectively. Additionally, the amplification of each sample exhibited
an equivalent rate of fluorescent emission intensity change per amount of DNA
target analyzed, as indicated by similar slopes derived from the sample
dilutions (Fig. 2). Any sample containing an excess of a PCR inhibitor would
exhibit a greater measured β-actin CT value for a given quantity of DNA. In
addition, the inhibitor would be diluted along with the sample in the dilution
analysis (Fig. 2), altering the expected CT value change. Each sample
amplification yielded a similar result in the analysis, demonstrating that this
method of sample preparation is highly reproducible with regard to sample
purity.
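
The per-sample statistics used for this comparison (mean, standard deviation, and coefficient of variation of triplicate CT values) can be computed as in the sketch below. The function name is hypothetical, the sample standard deviation (n - 1 denominator) is an assumption about how the table was calculated, and the three CT values are one triplicate read from the 100-ng column of Table 1.

```python
from statistics import mean, stdev


def triplicate_stats(ct_values):
    """Return (mean, sample standard deviation, CV in percent) for replicate CTs."""
    m = mean(ct_values)
    sd = stdev(ct_values)      # sample SD, n - 1 denominator (assumed)
    cv = 100.0 * sd / m        # coefficient of variation as a percentage
    return m, sd, cv


m, sd, cv = triplicate_stats([18.28, 18.36, 18.52])
print(f"mean={m:.2f} sd={sd:.2f} cv={cv:.2f}")
```

For this triplicate the rounded results agree with the corresponding mean, standard deviation, and CV entries in Table 1, which supports reading the CV column as a percentage.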

Quantitative Analysis of a Plasmid After

Table 1. Reproducibility of Sample Preparation Method

100 ng
Sample no.     CT                       Mean          Standard deviation   CV
1              18.24  18.23  18.33      18.27         0.06                 0.32
2              18.35  [illegible]       18.37         0.06                 0.36
3              18.3   18.3   18.42      18.34         0.07                 [illegible]
4              18.15  18.23  18.32      18.23         0.08                 0.46
5              18.4   18.38  18.46      18.42         0.04                 0.23
6              18.54  18.67  19         18.74         0.24                 1.26
7              18.28  18.36  18.52      18.39         0.12                 0.66
8              18.45  18.7   18.73      18.63         0.16                 0.83
9              18.18  18.34  18.26      18.29         0.1                  0.55
10             18.42  18.57  18.66      18.55         0.12                 0.65
Mean (1-10)                             18.42         0.17                 0.90

25 ng
Sample no.     CT                       Mean          Standard deviation   CV
1              20.46  20.55  20.5       20.51         0.03                 0.17
2              20.61  20.59  20.41      [illegible]   0.11                 0.54
3              20.54  20.6   20.49      20.54         0.06                 0.28
4              20.48  20.44  20.38      20.43         0.05                 0.26
5              20.68  20.87  20.63      20.73         0.13                 0.61
6              21.09  21.04  21.04      21.06         0.03                 0.15
7              20.67  20.73  20.65      20.68         0.04                 0.2
8              20.98  20.84  20.75      20.86         0.12                 0.57
9              20.46  20.54  20.48      20.51         0.07                 0.32
10             20.79  20.78  20.62      20.73         0.1                  0.16
Mean (1-10)                             20.66         0.19                 0.94
REAL TIME QUANTITATIVE PCR

[Figure 2 plot: CT (y-axis) versus log (ng input genomic DNA) (x-axis) for samples 1-10.]

Figure 2 Sample preparation purity. The replicate
samples shown in Table 1 were also amplified in
triplicate using 25 ng of each DNA sample. The fig-
ure shows the input DNA concentration (100 and
25 ng) vs. CT. In the figure, the 100 and 25 ng
points for each sample are connected by a line.

[...] vector containing a partial cDNA for human factor VIII, pF8TM. A series of
transfections was set up using a decreasing amount of the plasmid (40, 4, 0.5,
and 0.1 µg). Twenty-four hours post-transfection, total DNA was purified from
each flask of cells. β-Actin gene quantity was chosen as a value for
normalization of genomic DNA concentration from each sample. In this experiment,
β-actin gene content should remain constant relative to total genomic DNA.
Figure 3 shows the result of the β-actin DNA measurement (100 ng total DNA
determined by ultraviolet spectroscopy) of each sample. Each sample was analyzed
in triplicate and the mean β-actin CT values of the triplicates were plotted
(error bars represent [...] between any two sample means was 0.95 CT. Ten
nanograms of total DNA of each sample were also examined for β-actin. The
results again showed that very similar amounts of genomic DNA were present; the
maximum mean β-actin CT value difference was 1.0. As Figure 3 shows, the rate of
β-actin CT change between the 100- and 10-ng samples was similar (slope values
range between −3.56 and −3.45). This verifies again that the method of sample
preparation yields samples of identical PCR integrity (i.e., no sample contained
an excessive amount of a PCR inhibitor). However, these results indicate that
each sample contained slight differences in the actual amount of genomic DNA
analyzed. Determination of actual genomic DNA concentration was accomplished by
plotting the mean β-actin CT value obtained for each 100-ng sample on a β-actin
standard curve (shown in Fig. 4C). The actual genomic DNA concentration of each
sample, a, was obtained by extrapolation to the x-axis.
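
The slope check used here (CT change between the 100- and 10-ng inputs) is simple arithmetic, sketched below. The function names and the example CT values are hypothetical. The comparison with an ideal slope rests on an assumption not stated in the text: if product exactly doubled every cycle, a tenfold dilution would cost log2(10), about 3.32 cycles.

```python
import math


def slope_per_decade(ng_a, ct_a, ng_b, ct_b):
    """Slope of CT versus log10(ng input DNA) through two points."""
    return (ct_b - ct_a) / (math.log10(ng_b) - math.log10(ng_a))


def fold_per_cycle(slope):
    """Amplification factor per cycle implied by a standard-curve slope,
    assuming the factor is constant across cycles (an assumption)."""
    return 10 ** (-1.0 / slope)


ideal_slope = -1.0 / math.log10(2.0)                 # about -3.32
observed = slope_per_decade(100.0, 18.0, 10.0, 21.5)  # hypothetical CTs
print(round(ideal_slope, 2), round(observed, 2), round(fold_per_cycle(observed), 2))
```

Samples whose slopes agree with one another, as reported for the four transfections, are behaving alike in amplification; a sample carrying excess inhibitor would be expected to stand out because dilution relieves the inhibition.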

Figure 3 Analysis of transfected cell DNA quantity
and purity. The DNA preparations of the four 293
cell transfections (40, 4, 0.5, and 0.1 µg of pF8TM)
were analyzed for the β-actin gene. 100 and 10 ng
(determined by ultraviolet spectroscopy) of each
sample were amplified in triplicate. For each
amount of pF8TM that was transfected, the β-actin
CT values are plotted versus the total input DNA

Figure 4A shows the measured (i.e., non-normalized) quantities of factor VIII
plasmid DNA (pF8TM) from each of the four transient cell transfections. Each
reaction contained 100 ng of total sample DNA (as determined by UV
spectroscopy). Each sample was analyzed in triplicate PCR amplifications. As
shown, pF8TM purified from the 293 cells decreases (mean CT values increase)
with decreasing amounts of plasmid transfected. The mean CT values obtained for
pF8TM in Figure 4A were plotted on a standard curve comprised of serially
diluted pF8TM, shown in Figure 4B. The quantity of pF8TM, b, found in each of
the four transfections was determined by extrapolation to the x-axis of the
standard curve in Figure 4B. These uncorrected values, b, for pF8TM were
normalized to determine the actual amount of pF8TM found per 100 ng of genomic
DNA by using the equation:

$$\frac{b \times 100\ \text{ng}}{a} = \text{actual pF8TM copies per 100 ng of genomic DNA}$$

where a = actual genomic DNA in a sample and b = pF8TM copies from the standard
curve. The normalized quantity of pF8TM per 100 ng of genomic DNA for each of
the four transfections is shown in Figure 4D. These results show that the
quantity of factor VIII plasmid associated with the 293 cells, 24 hr after
transfection, decreases with decreasing plasmid concentration used in the
transfection. The quantity of pF8TM associated with 293 cells, after
transfection with 40 µg of plasmid, was 35 pg per 100 ng genomic DNA. This
results in ~520 plasmid copies per cell.
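
The normalization step above is a single ratio, shown here as a short sketch. The function name and the example numbers are hypothetical; a is the genomic DNA actually present (ng, read from the β-actin standard curve) and b is the uncorrected pF8TM quantity read from the pF8TM standard curve.

```python
def normalize_per_100ng(b, a_ng):
    """Scale an uncorrected plasmid quantity b to what it would be if the
    reaction had contained exactly 100 ng of genomic DNA (b * 100 / a)."""
    if a_ng <= 0:
        raise ValueError("genomic DNA amount must be positive")
    return b * 100.0 / a_ng


# Hypothetical example: the reaction nominally held 100 ng by UV, but the
# beta-actin standard curve indicates only 85 ng of genomic DNA was present.
print(round(normalize_per_100ng(30.0, 85.0), 2))
```

The result carries the same units as b, so the same function serves whether the standard curve reports plasmid mass or copy number.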



DISCUSSION 

We have described a new method for quantitating gene copy numbers using
real-time analysis of PCR amplifications. Real-time PCR is compatible with
either of the two PCR (RT-PCR) approaches: (1) quantitative competitive, where
an internal competitor for each target sequence is used for normalization (data
not shown), or (2) quantitative comparative PCR using a normalization gene
contained within the sample (i.e., β-actin) or a "housekeeping" gene for RT-PCR.
If equal amounts of nucleic acid are analyzed for each sample and if the
amplification efficiency before quantitative analysis is identical for each
sample, the internal control (normalization gene or competitor) should give
equal signals for all samples.

The real-time PCR method offers several advantages over the other two methods
currently employed (see the Introduction). First, the real-time PCR method is
performed in a closed-tube system and requires no post-PCR manipulation of
sample. Therefore, the potential for PCR contamination in the laboratory is
reduced because amplified products can be analyzed and disposed of without
opening the reaction tubes. Second, this method supports the use of a
normalization gene (i.e., β-actin) for quantitative PCR or housekeeping genes
for quantitative RT-PCR controls. Analysis is performed in real time during the
log phase of product accumulation. Analysis during log phase permits many
different genes (over a wide input target range) to be analyzed simultaneously,
without concern of reaching reaction plateau at different cycles. This will make
multigene analysis assays much easier to develop, because individual internal
competitors will not be needed for each gene under analysis. Third, sample
throughput will increase dramatically with the new method because there is no
post-PCR processing time. Additionally, working in a 96-well format is highly
compatible with automation technology.

Figure 4 [...] analysis of pF8TM in transfected cells. (A) Amount of [...]
plotted against the mean CT value [...] remaining [...] hr after transfection.
(B,C) Standard curves of [...] with the appropriate primers. The β-actin
standard curve was used to normalize the results of A to 100 ng of genomic DNA.
(D) The amount of pF8TM present per 100 ng of genomic DNA.
The real-time PCR method is highly reproducible. Replicate amplifications can be
analyzed for each sample, minimizing potential error. The system allows for a
very large assay dynamic range (approaching 1,000,000-fold starting target).
Using a standard curve for the target of interest, relative copy number values
can be determined for any unknown sample. Fluorescent threshold values, CT,
correlate linearly with relative DNA copy numbers. Real time quantitative RT-PCR
methodology (Gibson et al., this issue) has also been developed. Finally, real
time quantitative PCR methodology can be used to develop high-throughput
screening assays for a variety of applications [quantitative gene expression
(RT-PCR), gene copy assays (Her2, HIV, etc.), genotyping (knockout mouse
analysis), and immuno-PCR].

Real-time PCR may also be performed using intercalating dyes (Higuchi et al.
1992) such as ethidium bromide. The fluorogenic probe method offers a major
advantage over intercalating dyes: greater specificity (i.e., primer dimers and
nonspecific PCR products are not detected).

METHODS

Generation of a Plasmid Containing a Partial
cDNA for Human Factor VIII

Total RNA was harvested (RNAzol B from Tel-Test, Inc., Friendswood, TX) from
cells transfected with a factor VIII expression vector, pCIS2.8c25D (Eaton et
al. [...]; Gorman et al. 1990). A factor VIII partial cDNA sequence was
generated by RT-PCR [GeneAmp EZ rTth RNA PCR Kit (part N808-0179, PE Applied
Biosystems, Foster City, CA)] using the PCR primers F8for and F8rev (primer
sequences are shown below). The amplicon was reamplified using modified F8for
and F8rev primers (appended with BamHI and HindIII restriction site sequences at
the 5' end) and cloned into pGEM-3Z (Promega Corp., Madison, WI). The resulting
clone, pF8TM, was used for transient transfection of 293 cells.



Amplification of Target DNA and Detection of
Amplicon

Factor VIII plasmid DNA (pF8TM) was amplified with the primers F8for 5'-[...]-3'
and F8rev 5'-[...]-3'. The reaction produced a 122-bp PCR product. The forward
primer was designed to recognize a unique sequence found in the 5' untranslated
region of the parent pCIS2.8c25D plasmid and therefore does not recognize and
amplify the human factor VIII gene. Primers were chosen with the assistance of
the computer program Oligo 1.0 (National Biosciences, Inc., Plymouth, MN). The
human β-actin gene was amplified with the primers β-actin forward primer
5'-TCACCCACACTGTGCCCATCTACGA-3' and β-actin reverse primer
5'-CAGCGGAACCGCTCATTGCCAATGG-3'. The reaction produced a 295-bp PCR product.

Amplification reactions (50 µl) contained a DNA sample, 10× PCR Buffer II (5
µl), 200 µM dATP, dCTP, dGTP, and 400 µM dUTP, 4 mM MgCl2, 1.25 Units AmpliTaq
DNA polymerase, 0.5 unit AmpErase uracil N-glycosylase (UNG), 60 pmole of each
factor VIII primer, and 15 pmole of each β-actin primer. The reactions also
contained one of the following detection probes (100 nM each): [...](TAMRA)p-3'
and β-actin probe 5'-(FAM)ATGCCC-X(TAMRA)CCCCCATGCCATCp-3', where p indicates
phosphorylation and X indicates a linker arm nucleotide. Reaction tubes were
MicroAmp Optical Tubes (part number N801-0933, Perkin Elmer) that were frosted
(at Perkin Elmer) to prevent light from reflecting. Tube caps were similar to
MicroAmp Caps but specially designed to prevent light scattering. All of the PCR
consumables were supplied by PE Applied Biosystems (Foster City, CA) except the
factor VIII primers, which were synthesized at Genentech, Inc. (South San
Francisco, CA). Probes were designed using the Oligo 1.0 software, following
guidelines suggested in the Model 7700 Sequence Detector instrument manual.
Briefly, probe Tm should be at least 5°C higher than the annealing temperature
used during thermal cycling; primers should not form stable duplexes with the
probe.

The thermal cycling conditions included 2 min at 50°C and 10 min at 95°C.
Thermal cycling proceeded with [...] reactions were performed in the Model 7700
Sequence Detector (PE Applied Biosystems), which contains a GeneAmp [...] System
[...]. Reaction conditions were programmed on a Power Macintosh 7100 (Apple
Computer, Santa Clara, CA) linked directly to the Model 7700 Sequence Detector.
Analysis of data was also performed on the Macintosh computer. Collection and
analysis software was developed at PE Applied Biosystems.

Transfection of Cells with Factor VIII Construct

Four T175 flasks of 293 cells (ATCC CRL 1573), a human
fetal kidney suspension cell line, were grown to 80% confluency
and transfected pF8TM. Cells were grown in the
following media: 50% HAM'S F12 without GHT, 50% low
glucose Dulbecco's modified Eagle medium (DMEM) without
glycine, with sodium bicarbonate, 10% fetal bovine
serum, 2 mM L-glutamine, and 1% penicillin-streptomycin.

The media was changed 30 min before the transfection.
pF8TM DNA amounts of 40, 4, 0.5, and 0.1 μg were
added to 1.5 ml of a solution containing 0.125 M CaCl2
and 1× HEPES. The four mixtures were left at room temperature
for 10 min and then added dropwise to the cells.
The flasks were incubated at 37°C and 5% CO2 for 24 hr,
washed with PBS, and resuspended in PBS. The
resuspended cells were divided into aliquots and DNA was
extracted immediately using the QIAamp Blood Kit (Qiagen,
Chatsworth, CA). DNA was eluted into 200 μl of 50 mM
Tris-HCl at pH 8.0,
WISP genes are members of the connective tissue growth factor
family that are up-regulated in Wnt-1-transformed cells and
aberrantly expressed in human colon tumors

ABSTRACT Wnt family members are critical to many
developmental processes, and components of the Wnt signaling
pathway have been linked to tumorigenesis in familial and
sporadic colon carcinomas. Here we report the identification
of two genes, WISP-1 and WISP-2, that are up-regulated in the
mouse mammary epithelial cell line C57MG transformed by
Wnt-1, but not by Wnt-4. Together with a third related gene,
WISP-3, these proteins define a subfamily of the connective
tissue growth factor family. Two distinct systems demonstrated
WISP induction to be associated with the expression of
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline-repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overexpressed
(4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-20q13
and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1
signaling and that aberrant levels of WISP expression in colon
cancer may play a role in colon tumorigenesis.

Wnt-1 is a member of an expanding family of cysteine-rich,
glycosylated signaling proteins that mediate diverse developmental
processes such as the control of cell proliferation,
adhesion, cell polarity, and the establishment of cell fates (1,
2). Wnt-1 originally was identified as an oncogene activated by
the insertion of mouse mammary tumor virus in virus-induced
mammary adenocarcinomas (3, 4). Although Wnt-1 is not
expressed in the normal mammary gland, expression of Wnt-1
in transgenic mice causes mammary tumors (5).

In mammalian cells, Wnt family members initiate signaling
by binding to the seven-transmembrane spanning Frizzled
receptors and recruiting the cytoplasmic protein Dishevelled
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the
kinase activity of the normally constitutively active glycogen
synthase kinase-3β (GSK-3β), resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the
transcription factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations contribute
to the development of these types of cancer, implicating
the Wnt pathway in tumorigenesis (1).

Although much has been learned about the Wnt signaling
pathway over the past several years, only a few of the
transcriptionally activated downstream components activated by
Wnt have been characterized. Those that have been described
cannot account for all of the diverse functions attributed to
Wnt signaling. Among the candidate Wnt target genes are
those encoding the nodal-related 3 gene, Xnr3, a member of
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the
Wnt signaling pathway (10).

To identify additional downstream genes in the Wnt signaling
pathway that are relevant to the transformed cell phenotype,
we used a PCR-based cDNA subtraction strategy, suppression
subtractive hybridization (SSH) (11), using RNA
isolated from C57MG mouse mammary epithelial cells and
C57MG cells stably transformed by a Wnt-1 retrovirus.
Overexpression of Wnt-1 in this cell line is sufficient to induce a
partially transformed phenotype, characterized by elongated
and refractile cells that lose contact inhibition and form a
multilayered array (12, 13). We reasoned that genes differentially
expressed between these two cell lines might contribute
to the transformed phenotype.

In this paper, we describe the cloning and characterization
of two genes up-regulated in Wnt-1 transformed cells, WISP-1
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which
includes connective tissue growth factor (CTGF), Cyr61, and
nov, a family not previously linked to Wnt signaling.

Cell Biology, Medical Sciences: Pennica et al.
Proc. Natl. Acad. Sci. USA 95 (1998)

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
To whom reprint requests should be addressed. E-mail: diane@gene.com.

The publication costs of this article were defrayed in part by page charge
payment. This article must therefore be hereby marked "advertisement" in
accordance with 18 U.S.C. §1734 solely to indicate this fact.

© 1998 by The National Academy of Sciences 0027-8424/98/9514717-6$2.00/0
PNAS is available online at www.pnas.org.

MATERIALS AND METHODS

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 μg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 μg
of poly(A)+ RNA from the parent C57MG cells. The subtracted
cDNA library was subcloned into a pGEM-T vector for
further analysis.

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones encoding
full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides 1463-1512. Full-length
cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of 
first-strand cDNA was performed with human Multiple Tissue 
cDNA panels (CLONTECH) and 300 μM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles. 
WISP and glyceraldehyde-3-phosphate dehydrogenase primer 
sequences are available on request. 

In Situ Hybridization. 33P-labeled sense and antisense riboprobes
were transcribed from an 897-bp PCR product corresponding
to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of 
mouse WISP-2. All tissues were processed as described (40). 

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers. 

Cell Lines, Tumors, and Mucosa Specimens. Tissue speci- 
mens were obtained from the Department of Pathology (Uni- 
versity of Pittsburgh) for patients undergoing colon resection 
and from the University of Leeds, United Kingdom. Genomic 
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29, 
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS 
174T). DNA concentration was determined by using Hoechst 
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative 
gene amplification and RNA expression of WISPs and c-myc in 
the cell lines, colorectal tumors, and normal mucosa were 
determined by quantitative PCR. Gene-specific primers and 
fluorogenic probes (sequences available on request) were 
designed and used to amplify and quantitate the genes. The 
relative gene copy number was derived by using the formula
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate dehydrogenase
housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems. 
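
The calculation described in this section can be sketched as follows. This is a minimal illustration in Python, not the authors' analysis software. It assumes that normalization to glyceraldehyde-3-phosphate dehydrogenase is done by subtracting the housekeeping-gene cycle difference from the target cycle difference, and that the δ-method SE is the first-order approximation ln(2) x 2^(ΔCt) x SE(ΔCt). The function names and every Ct value below are hypothetical.

```python
import math


def normalized_delta_ct(target_ref, target_sample, gapdh_ref, gapdh_sample):
    """Cycle difference for the target gene minus the GAPDH cycle difference.

    Subtracting the housekeeping difference is an assumed normalization;
    the text only states that the signal was normalized to GAPDH.
    """
    return (target_ref - target_sample) - (gapdh_ref - gapdh_sample)


def relative_level(delta_ct):
    """Relative gene copy number or RNA level, computed as 2**delta_ct."""
    return 2.0 ** delta_ct


def delta_method_se(delta_ct, se_delta_ct):
    """First-order (delta-method) SE of 2**delta_ct.

    d/dx 2**x = ln(2) * 2**x, so SE(2**x) is approximately ln(2) * 2**x * SE(x).
    """
    return math.log(2.0) * (2.0 ** delta_ct) * se_delta_ct


# Hypothetical Ct values, for illustration only.
d_ct = normalized_delta_ct(target_ref=27.0, target_sample=25.0,
                           gapdh_ref=20.0, gapdh_sample=20.0)
fold = relative_level(d_ct)
se = delta_method_se(d_ct, se_delta_ct=0.1)
print(f"delta Ct = {d_ct:.1f}, relative level = {fold:.1f} +/- {se:.2f}")
```

With these made-up inputs the normalized ΔCt is 2 cycles, which corresponds to a 4-fold relative level.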

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify Wnt- 
1-inducible genes, we used the technique of SSH using the 



mouse mammary epithelial cell line C57MG and C57MG cells 
that stably express Wnt-1 (11). Candidate differentially expressed
cDNAs (1,384 total) were sequenced. Thirty-nine
percent of the sequences matched known genes or homo- 
logues, 32% matched expressed sequence tags, and 29% had 
no match. To confirm that the transcript was differentially 
expressed, semiquantitative reverse transcription-PCR and 
Northern analysis were performed by using mRNA from the 
C57MG and C57MG/Wnt-1 cells. 

Two of the cDNAs, WISP-1 and WISP-2, were differentially 
expressed, being induced in the C57MG/Wnt-1 cell line, but 
not in the parent C57MG cells or C57MG cells overexpressing 
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce
the morphological transformation of C57MG cells and has no
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell 
line and WISP-2 by approximately 5-fold by both Northern 
analysis and reverse transcription-PCR. 

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells express- 
ing the Wnt-1 gene under the control of a tetracycline- 
repressible promoter produce low amounts of Wnt-1 in the 
repressed state but show a strong induction of Wnt-1 mRNA
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the 
sequence compared with mouse WISP-1. The cDNA sequences 
of mouse and human WISP-1 were 1,766 and 2,830 bp in length, 
respectively, and encode proteins of 367 aa, with predicted 
relative molecular masses of 40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cysteine
residues, and four potential N-linked glycosylation sites
and are 84% identical (Fig. 2A).

Full-length cDNA clones of mouse and human WISP-2 were 
1,734 and 1,293 bp in length, respectively, and encode proteins 
of 251 and 250 aa, respectively, with predicted relative molecular
masses of 27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at
position 197. WISP-2 has 28 cysteine residues that are conserved
among the 38 cysteines found in WISP-1.

Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/Wnt-4
cells. Poly(A)+ RNA (2 μg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential
signal sequence, insulin-like growth factor-binding protein (IGF-BP),
VWC, thrombospondin (TSP), and C-terminal (CT) domains are
underlined.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the
WISP-1 protein sequence and identified several ESTs as
potentially related sequences. We identified a homologous
protein that we have called WISP-3. A full-length human
WISP-3 cDNA of 1,371 bp was isolated corresponding to those
ESTs that encode a 354-aa protein with a predicted molecular
mass of 39,293. WISP-3 has two potential N-linked glycosylation
sites and 36 cysteine residues. An alignment of the three
human WISP proteins shows that WISP-1 and WISP-3 are the
most similar (42% identity), whereas WISP-2 has 37% identity
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).
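
Percent identity figures of this kind are derived from a pairwise alignment. The following Python sketch shows one common way to compute such a value. It assumes identity is counted over alignment columns in which neither sequence has a gap; the text does not state which convention was used, and the function name and the two short aligned strings are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two aligned sequences.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Counting only gap-free columns is an assumed convention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned fragments, not real WISP sequences.
identity = percent_identity("ACDE-FG", "ACDQWFG")
print(f"{identity:.1f}% identity")
```

Here five of the six gap-free columns match. Other conventions, such as dividing by the length of the shorter sequence, give different values for the same alignment.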

WISPs Are Homologous to the CTGF Family of Proteins.
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology (36-44%)
was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18,
19). nov (nephroblastoma overexpressed) is an immediate
early gene associated with quiescence and found altered in
Wilms tumors (20). The proteins of the CCN family share
functional, but not sequence, similarity to Wnt-1. All are
secreted, cysteine-rich, heparin-binding glycoproteins that associate
with the cell surface and extracellular matrix.

Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic representation
of the WISP proteins showing the domain structure and cysteine
residues (vertical lines). The four cysteine residues in the VWC domain
that are absent in WISP-3 are indicated with a dot. (C) Expression of
WISP mRNA in human tissues. PCR was performed on human
multiple-tissue cDNA panels (CLONTECH) from the indicated adult
and fetal tissues.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence (GCGCCXXC)
conserved in most insulin-like growth factor (IGF)-binding
proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third
position instead of a glycine. CTGF recently has been shown
to specifically bind IGF (22) and a truncated nov protein
lacking the IGF-BP domain is oncogenic (23). The von Willebrand
factor type C module (VWC), also found in certain
collagens and mucins, covers the next 10 cysteine residues, and
is thought to participate in protein complex formation and
oligomerization (24). The VWC domain of WISP-3 differs
from all CCN family members described previously, in that it
contains only six of the 10 cysteine residues (Fig. 3 A and B).
A short variable region follows the VWC domain. The third
module, the thrombospondin (TSP) domain, is involved in
binding to sulfated glycoconjugates and contains six cysteine
residues and a conserved WSxCSxxCG motif first identified in
thrombospondin (25). The C-terminal (CT) module containing
the remaining 10 cysteines is thought to be involved in
dimerization and receptor binding (26). The CT domain is
present in all CCN family members described to date but is
absent in WISP-2 (Fig. 3 A and B). The existence of a putative
signal sequence and the absence of a transmembrane domain
suggest that WISPs are secreted proteins, an observation
supported by an analysis of their expression and secretion from
mammalian cell and baculovirus cultures (data not shown).
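
The two consensus motifs quoted above, GCGCCXXC and WSxCSxxCG, can be located in a protein sequence with a simple pattern search. The Python sketch below is illustrative only. It assumes that X (or x) stands for any single residue, and the helper name and the toy sequences are invented for the example; they are not WISP sequences.

```python
import re

# Consensus motifs from the text, with X/x treated as "any residue".
IGFBP_MOTIF = re.compile(r"GCGCC..C")
TSP_MOTIF = re.compile(r"WS.CS..CG")


def find_motif(pattern, protein):
    """Return 0-based start positions of all matches, including overlaps."""
    # A zero-width lookahead lets finditer report overlapping occurrences.
    lookahead = re.compile(f"(?={pattern.pattern})")
    return [m.start() for m in lookahead.finditer(protein)]


# Made-up sequence containing one copy of each motif.
toy = "MAAGCGCCKVCAQQWSPCSKTCGLL"
print(find_motif(IGFBP_MOTIF, toy))
print(find_motif(TSP_MOTIF, toy))

# A glutamine in the third position, as described for WISP-1,
# is not matched by the strict consensus pattern.
print(find_motif(IGFBP_MOTIF, "GCQCCKVC"))
```

A strict consensus search of this kind would miss the WISP-1 variant, so a looser pattern such as GC.CC..C would be needed to catch it.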

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C).
Little or no expression was detected in the brain, liver, skeletal
muscle, colon, peripheral blood leukocytes, prostate, testis, or
thymus. WISP-2 had a more restricted tissue expression and
was detected in adult skeletal muscle, colon, ovary, and fetal
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and
small intestine.

In Situ Localization of WISP-1 and WISP-2. Expression of
WISP-1 and WISP-2 was assessed by in situ hybridization in
mammary tumors from Wnt-1 transgenic mice. Strong expression
of WISP-1 was observed in stromal fibroblasts lying within
the fibrovascular tumor stroma (Fig. 4 A-D). However, low-level
WISP-1 expression also was observed focally within tumor
cells (data not shown). No expression was observed in normal
breast. Like WISP-1, WISP-2 expression also was seen in the
tumor stroma in breast tumors from Wnt-1 transgenic animals
(Fig. 4 E-H). However, WISP-2 expression in the stroma was
in spindle-shaped cells adjacent to capillary vessels, whereas
the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The corresponding
dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal (s) fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells
appeared to be adjacent to capillary vessels whereas tumor cells are
negative (G and H).

Chromosome Localization of the WISP Genes. The chro- 
mosomal location of the human WISP genes was determined 
by radiation hybrid mapping panels. WISP-1 is approximately 
3.48 cR from the meiotic marker AFM259xc5 [logarithm of 
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the 
same region as the human locus of the novH family member 
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine 
mapping indicates that WISP-1 is located near D8S1712 STS. 
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on
chromosome 20q12-20q13.1. Human WISP-3 mapped to chromosome
6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene
MYB (27, 29). 

Amplification and Aberrant Expression of WISPs in Human
Colon Tumors. Amplification of protooncogenes is seen in
many human tumors and has etiological and prognostic significance.
For example, in a variety of tumor types, c-myc
amplification has been associated with malignant progression
and poor prognosis (30). Because WISP-1 resides in the same
general chromosomal location (8q24) as c-myc, we asked
whether it was a target of gene amplification, and, if so,
whether this amplification was independent of the c-myc locus.
Genomic DNA from human colon cancer cell lines was
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold) 
amplification, with the HT-29 and WiDr cell lines demonstrat- 
ing an 8-fold increase. Significantly, the pattern of amplifica- 
tion observed did not correlate with that observed for c-myc, 
indicating that the c-myc gene is not part of the amplicon that 
involves the WISP-1 locus. 

We next examined whether the WISP genes were amplified 
in a panel of 25 primary human colon adenocarcinomas. The 
relative WISP gene copy number in each colon tumor DNA 
was compared with pooled normal DNA from 10 donors by 
quantitative PCR (Fig. 6). The copy number of WISP-1 and 
WISP-2 was significantly greater than one, approximately 
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold 
for WISP-2 in 92% of the tumors (P < 0.001 for each). The
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was significantly
higher than that of WISP-1 (P < 0.001).

Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by quantitative
PCR. (B) Southern blots containing genomic DNA (10 μg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

Fig. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes in 25
adenocarcinomas was assayed by quantitative PCR, by comparing
DNA from primary human tumors with pooled DNA from 10 healthy
donors. The data are means ± SEM from one experiment done in
triplicate. The experiment was repeated at least three times.

Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assayed by
quantitative PCR. The Dukes stage of the tumor is listed under the
sample number. The data are means ± SEM from one experiment
done in triplicate. The experiment was repeated at least twice.

The levels of WISP transcripts in RNA isolated from 19
adenocarcinomas and their matched normal mucosa were
assessed by quantitative PCR (Fig. 7). The level of WISP-1
RNA present in tumor tissue varied but was significantly
increased (2- to >25-fold) in 84% (16/19) of the human colon
tumors examined compared with normal adjacent mucosa.
Four of 19 tumors showed greater than 10-fold overexpression.
In contrast, in 79% (15/19) of the tumors examined, WISP-2
RNA expression was significantly lower in the tumor than the
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal
mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold.
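
The percentages quoted above follow directly from the tumor counts. As a quick arithmetic check, the Python sketch below recomputes them from the counts given in the text; the dictionary labels and the helper name are illustrative only.

```python
def percent(count, total):
    """Percentage of tumors, rounded to the nearest whole number."""
    return round(100 * count / total)


# Counts as reported in the text (tumors showing the change, tumors examined).
reported = {
    "WISP-1 overexpressed": (16, 19),
    "WISP-2 reduced": (15, 19),
    "WISP-3 overexpressed": (12, 19),
}

summary = {label: percent(n, total) for label, (n, total) in reported.items()}
for label, value in summary.items():
    print(f"{label}: {value}%")
```

The rounded values agree with the 84%, 79%, and 63% stated in the text.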



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and
malignant cells have been used to clone differentially expressed
genes (31). We have used a PCR-based selection
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1. 

Three of the genes isolated, WISP-1, WISP-2, and WISP-3, 
are members of the CCN family of growth factors, which 
includes CTGF, Cyr61, and nov, a family not previously linked 
to Wnt signaling. 

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of 
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1, 
whereas normal breast tissue does not. No WISP RNA expres- 
sion was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly 
induced by the downstream components of the Wnt-1 signaling 
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1 
signaling turning on a transcription factor, which in turn 
regulates WISPs. 

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain, 
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3.
This domain is thought to be involved in receptor binding and
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine
knot motif, exist as dimers (32). It is tempting to speculate that
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2
exists as a monomer. If the CT domain is also important for
receptor binding, WISP-2 may bind its receptor through a
different region of the molecule than the other CCN family
members. No specific receptors have been identified for CTGF
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from 
Wnt-1 transgenic animals is consistent with previous obser- 
vations that transcripts for the related CTGF gene are pri- 
marily expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed 
that mammary tumor cells or inflammatory cells at the tumor 
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was 
observed in the stromal cells that surrounded the tumor cells
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling
could occur in which the stromal cells could supply WISP-1 and
WISP-2 to regulate tumor cell growth on the WISP extracellular
matrix. Stromal cell-derived factors in the extracellular
matrix have been postulated to play a role in tumor cell 
migration and proliferation (35). The localization of WISP-1 
and WISP-2 in the stromal cells of breast tumors supports this 
paracrine model. 

An analysis of WISP-1 gene amplification and expression in 
human colon tumors showed a correlation between DNA 
amplification and overexpression, whereas overexpression of
WISP-3 RNA was seen in the absence of DNA amplification.
In contrast, WISP-2 DNA was amplified in the colon tumors,
but its mRNA expression was significantly reduced in the
majority of tumors compared with the expression in normal
colonic mucosa from the same patient. The gene for human
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in
node-negative breast cancer and many colon cancers, suggesting
the existence of one or more oncogenes at this locus
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification 
observed for WISP-2 may be caused by another gene in this 
amplicon. 

A recent manuscript on rCop-1, the rat orthologue of
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions 
of either gene can cause the stabilization and accumulation of 
cytoplasmic β-catenin, which presumably contributes to hu- 
man carcinogenesis through the activation of target genes such 
as the WISPs. Although the mechanism by which Wnt-1 
transforms cells and induces tumorigenesis is unknown, the 
identification of WISPs as genes that may be regulated down- 
stream of Wnt-1 in C57MG cells suggests they could be 
important mediators of Wnt-1 transformation. The amplifica- 
tion and altered expression patterns of the WISPs in human 
colon tumors may indicate an important role for these genes 
in tumor development. 
methods. Peptides AENK or AEQK were dissolved in water, made isotonic with 
NaCl and diluted into RPMI growth medium. T-cell-proliferation assays were 
done essentially as described 20,21 . Briefly, after antigen pulsing (30 µg ml⁻¹ 
TTCF) with tetrapeptides (1-2 mg ml⁻¹), PBMC or EBV-B cells were 
washed in PBS and fixed for 45 s in 0.05% glutaraldehyde. Glycine was added 
to a final concentration of 0.1 M and the cells were washed five times in RPMI 
1640 medium containing 1% FCS before co-culture with T-cell clones in 
round-bottom 96-well microtitre plates. After 48 h, the cultures were pulsed 
with 1 µCi of ³H-thymidine and harvested for scintillation counting 16 h later. 
Predigestion of native TTCF was done by incubating 200 µg TTCF with 0.25 µg 
pig kidney legumain in 500 µl 50 mM citrate buffer, pH 5.5, for 1 h at 37 °C. 
Glycopeptide digestions. The peptides HIDNEEDI, HIDN(N-glucosamine) 
EEDI and HIDNESDI, which are based on the TTCF sequence, and 
QQQHLFGSNVTDCSGNFCLFR(KKK), which is based on human transferrin, 
were obtained by custom synthesis. The three C-terminal lysine residues were 
added to the natural sequence to aid solubility. The transferrin glycopeptide 
QQQHLFGSNVTDCSGNFCLFR was prepared by tryptic (Promega) digestion 
of 5 mg reduced, carboxy-methylated human transferrin followed by 
concanavalin A chromatography 11 . Glycopeptides corresponding to residues 
622-642 and 421-452 were isolated by reverse-phase HPLC and identified by 
mass spectrometry and N-terminal sequencing. The lyophilized transferrin- 
derived peptides were redissolved in 50 mM sodium acetate, pH 5.5, 10 mM 
dithiothreitol, 20% methanol. Digestions were performed for 3 h at 30 °C with 
5-50 mU ml⁻¹ pig kidney legumain or B-cell AEP. Products were analysed by 
HPLC or MALDI-TOF mass spectrometry using a matrix of 10 mg ml⁻¹ α- 
cyanocinnamic acid in 50% acetonitrile/0.1% TFA and a PerSeptive Biosystems 
Elite STR mass spectrometer set to linear or reflector mode. Internal standar- 
dization was obtained with a matrix ion of 568.13 mass units. 
Fas ligand (FasL) is produced by activated T cells and natural 
killer cells and it induces apoptosis (programmed cell death) in 
target cells through the death receptor Fas/Apo1/CD95 (ref. 1). 
One important role of FasL and Fas is to mediate immune- 
cytotoxic killing of cells that are potentially harmful to the 
organism, such as virus-infected or tumour cells 1 . Here we 
report the discovery of a soluble decoy receptor, termed decoy 
receptor 3 (DcR3), that binds to FasL and inhibits FasL-induced 
apoptosis. The DcR3 gene was amplified in about half of 35 
primary lung and colon tumours studied, and DcR3 messenger 
RNA was expressed in malignant tissue. Thus, certain tumours 
may escape FasL-dependent immune-cytotoxic attack by expres- 
sing a decoy receptor that blocks FasL. 

By searching expressed sequence tag (EST) databases, we identi- 
fied a set of related ESTs that showed homology to the tumour 
necrosis factor (TNF) receptor (TNFR) gene superfamily 2 . Using 
the overlapping sequence, we isolated a previously unknown full- 
length complementary DNA from human fetal lung. We named the 
protein encoded by this cDNA decoy receptor 3 (DcR3). The cDNA 
encodes a 300-amino-acid polypeptide that resembles members of 
the TNFR family (Fig. 1a): the amino terminus contains a leader 
sequence, which is followed by four tandem cysteine-rich domains 
(CRDs). Like one other TNFR homologue, osteoprotegerin (OPG) 3 , 
DcR3 lacks an apparent transmembrane sequence, which indicates 
that it may be a secreted, rather than a membrane-associated, 
molecule. We expressed a recombinant, histidine-tagged form of 
DcR3 in mammalian cells; DcR3 was secreted into the cell culture 
medium, and migrated on polyacrylamide gels as a protein of 
relative molecular mass 35,000 (data not shown). DcR3 shares 
sequence identity in particular with OPG (31%) and TNFR2 
(29%), and has relatively less homology with Fas (17%). All of 
the cysteines in the four CRDs of DcR3 and OPG are conserved; 
however, the carboxy-terminal portion of DcR3 is 101 residues 
shorter. 

We analysed expression of DcR3 mRNA in human tissues by 
northern blotting (Fig. 1b). We detected a predominant 1.2-kilobase 
transcript in fetal lung, brain, and liver, and in adult spleen, colon 
and lung. In addition, we observed relatively high DcR3 mRNA 
expression in the human colon carcinoma cell line SW480. 

To investigate potential ligand interactions of DcR3, we generated 
a recombinant, Fc-tagged DcR3 protein. We tested binding of 
DcR3-Fc to human 293 cells transfected with individual TNF- 
family ligands, which are expressed as type 2 transmembrane 
proteins (these transmembrane proteins have their N termini in 
the cytosol). DcR3-Fc showed a significant increase in binding to 
cells transfected with FasL 4 (Fig. 2a), but not to cells transfected with 
TNF 5 , Apo2L/TRAIL 6,7 , Apo3L/TWEAK 8,9 , or OPGL/TRANCE/ 
RANKL 10-12 (data not shown). DcR3-Fc immunoprecipitated shed 
FasL from FasL-transfected 293 cells (Fig. 2b) and purified soluble 
FasL (Fig. 2c), as did the Fc-tagged ectodomain of Fas but not 
TNFR1. Gel-filtration chromatography showed that DcR3-Fc and 
soluble FasL formed a stable complex (Fig. 2d). Equilibrium 
analysis indicated that DcR3-Fc and Fas-Fc bound to soluble 
FasL with a comparable affinity (Kd = 0.8 ± 0.2 and 
1.1 ± 0.1 nM, respectively; Fig. 2e), and that DcR3-Fc could 
block nearly all of the binding of soluble FasL to Fas-Fc (Fig. 2e, 
inset). Thus, DcR3 competes with Fas for binding to FasL. 
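
To give a feel for what the two reported dissociation constants imply, the following is a minimal sketch that assumes a simple one-site binding model, in which fractional occupancy equals [L] / (Kd + [L]). The model, the function name and the example ligand concentrations are illustrative assumptions; the text does not state how the Kd values were fitted.

```python
"""One-site equilibrium binding sketch (illustrative assumption only)."""

# Dissociation constants reported in the text, in nM.
KD_DCR3_FC_NM = 0.8
KD_FAS_FC_NM = 1.1


def fraction_bound(ligand_nm: float, kd_nm: float) -> float:
    """Return fractional occupancy [L] / (Kd + [L]) for a single-site model."""
    if ligand_nm < 0:
        raise ValueError("ligand concentration must be non-negative")
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    return ligand_nm / (kd_nm + ligand_nm)


if __name__ == "__main__":
    # Hypothetical soluble FasL concentrations (nM), chosen only for illustration.
    for conc in (0.1, 1.0, 10.0):
        dcr3 = fraction_bound(conc, KD_DCR3_FC_NM)
        fas = fraction_bound(conc, KD_FAS_FC_NM)
        print(f"{conc:5.1f} nM FasL: DcR3-Fc {dcr3:.2f}, Fas-Fc {fas:.2f}")
```

Under this assumed model each receptor is half-occupied when the ligand concentration equals its Kd, so the similar Kd values translate into similar occupancy curves, in line with the "comparable affinity" described above.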

To determine whether binding of DcR3 inhibits FasL activity, we 
tested the effect of DcR3-Fc on apoptosis induction by soluble 
FasL in Jurkat T leukaemia cells, which express Fas (Fig. 3a). DcR3- 
Fc and Fas-Fc blocked soluble-FasL-induced apoptosis in a 
similar dose-dependent manner, with half-maximal inhibition at 
~0.1 µg ml⁻¹. Time-course analysis showed that the inhibition did 
not merely delay cell death, but rather persisted for at least 24 hours 
(Fig. 3b). We also tested the effect of DcR3-Fc on activation- 
induced cell death (AICD) of mature T lymphocytes, a FasL- 
dependent process 1 . Consistent with previous results 13 , activation 
of interleukin-2-stimulated CD4-positive T cells with anti-CD3 
antibody increased the level of apoptosis twofold, and Fas-Fc 
blocked this effect substantially (Fig. 3c); DcR3-Fc blocked the 
induction of apoptosis to a similar extent. Thus, DcR3 binding 
blocks apoptosis induction by FasL. 

FasL-induced apoptosis is important in elimination of virus- 
infected cells and cancer cells by natural killer cells and cytotoxic T 
lymphocytes; an alternative mechanism involves perforin and 
granzymes 1,14-16 . Peripheral blood natural killer cells triggered 
marked cell death in Jurkat T leukaemia cells (Fig. 3d); DcR3-Fc 
and Fas-Fc each reduced killing of target cells from ~65% to 
~30%, with half-maximal inhibition at ~1 µg ml⁻¹; the residual 
killing was probably mediated by the perforin/granzyme pathway. 
Thus, DcR3 binding blocks FasL-dependent natural killer cell 
activity. Higher DcR3-Fc and Fas-Fc concentrations were required 
to block natural killer cell activity compared with those required to 
block soluble FasL activity, which is consistent with the greater 
potency of membrane-associated FasL compared with soluble 
FasL 17 . 

Given the role of immune-cytotoxic cells in elimination of 
tumour cells and the fact that DcR3 can act as an inhibitor of 
FasL, we proposed that DcR3 expression might contribute to the 
ability of some tumours to escape immune-cytotoxic attack. As 
genomic amplification frequently contributes to tumorigenesis, we 
investigated whether the DcR3 gene is amplified in cancer. We 
analysed DcR3 gene-copy number by quantitative polymerase chain 
Figure 1 Primary structure and expression of human DcR3. a, Alignment of the 
amino-acid sequences of DcR3 and of osteoprotegerin (OPG); the C-terminal 101 
residues of OPG are not shown. The putative signal cleavage site (arrow), the 
cysteine-rich domains (CRD 1-4), and the N-linked glycosylation site (asterisk) are 
shown. b, Expression of DcR3 mRNA. Northern hybridization analysis was done 
using the DcR3 cDNA as a probe and blots of poly(A)+ RNA (Clontech) from 
human fetal and adult tissues or cancer cell lines. PBL, peripheral blood 
lymphocyte. 



Figure 2 Interaction of DcR3 with FasL. a, 293 cells were transfected with pRK5 
vector (top) or with pRK5 encoding full-length FasL (bottom), incubated with 
DcR3-Fc (solid line, shaded area), TNFR1-Fc (dotted line) or buffer control 
(dashed line) (the dashed and dotted lines overlap), and analysed for binding by 
FACS. Statistical analysis showed a significant difference (P < 0.001) between the 
binding of DcR3-Fc to cells transfected with FasL or pRK5. PE, phycoerythrin- 
labelled cells. b, 293 cells were transfected as in a and metabolically labelled, and 
cell supernatants were immunoprecipitated with Fc-tagged TNFR1, DcR3 or Fas. 
c, Purified soluble FasL (sFasL) was immunoprecipitated with TNFR1-Fc, DcR3- 
Fc or Fas-Fc and visualized by immunoblot with anti-FasL antibody. sFasL was 
loaded directly for comparison in the right-hand lane. d, Flag-tagged sFasL was 
incubated with DcR3-Fc or with buffer and resolved by gel filtration; column 
fractions were analysed in an assay that detects complexes containing DcR3-Fc 
and sFasL-Flag. e, Equilibrium binding of DcR3-Fc or Fas-Fc to sFasL-Flag. 
Inset, competition of DcR3-Fc with Fas-Fc for binding to sFasL-Flag. 
reaction (PCR) 18 in genomic DNA from 35 primary lung and colon 
tumours, relative to pooled genomic DNA from peripheral blood 
leukocytes (PBLs) of 10 healthy donors. Eight of 18 lung tumours 
and 9 of 17 colon tumours showed DcR3 gene amplification, 
ranging from 2- to 18-fold (Fig. 4a, b). To confirm this result, we 
analysed the colon tumour DNAs with three more, independent sets 
of DcR3-based PCR primers and probes; we observed nearly the 
same amplification (data not shown). 

We then analysed DcR3 mRNA expression in primary tumour 
tissue sections by in situ hybridization. We detected DcR3 expres- 
sion in 6 out of 15 lung tumours, 2 out of 2 colon tumours, 2 out of 5 
breast tumours, and 1 out of 1 gastric tumour (data not shown). A 
section through a squamous-cell carcinoma of the lung is shown in 
Fig. 4c. DcR3 mRNA was localized to infiltrating malignant epithe- 
lium, but was essentially absent from adjacent stroma, indicating 
tumour-specific expression. Although the individual tumour speci- 
mens that we analysed for mRNA expression and gene amplification 
were different, the in situ hybridization results are consistent with 
the finding that the DcR3 gene is amplified frequently in tumours. 
SW480 colon carcinoma cells, which showed abundant DcR3 
mRNA expression (Fig. 1b), also had marked DcR3 gene amplifica- 
tion, as shown by quantitative PCR (fourfold) and by Southern blot 
hybridization (fivefold) (data not shown). 

If DcR3 amplification in cancer is functionally relevant, then 
DcR3 should be amplified more than neighbouring genomic 
regions that are not important for tumour survival. To test this, 
we mapped the human DcR3 gene by radiation-hybrid analysis; 
DcR3 showed linkage to marker AFM218xe7 (T160), which maps to 
chromosome position 20q13. Next, we isolated from a bacterial 
artificial chromosome (BAC) library a human genomic clone that 
carries DcR3, and sequenced the ends of the clone's insert. We then 
determined, from the nine colon tumours that showed twofold or 
greater amplification of DcR3, the copy number of the DcR3- 
flanking sequences (reverse and forward) from the BAC, and of 
seven genomic markers that span chromosome 20 (Fig. 4d). The 
DcR3-linked reverse marker showed an average amplification of 
roughly threefold, slightly less than the approximately fourfold 
amplification of DcR3; the other markers showed little or no 
amplification. These data indicate that DcR3 may be at the 'epi- 
centre' of a distal chromosome 20 region that is amplified in colon 
cancer, consistent with the possibility that DcR3 amplification 
promotes tumour survival. 

Our results show that DcR3 binds specifically to FasL and inhibits 
FasL activity. We did not detect DcR3 binding to several other TNF- 
ligand-family members; however, this does not rule out the possi- 
bility that DcR3 interacts with other ligands, as do some other 
TNFR family members, including OPG 2,19 . 

FasL is important in regulating the immune response; however, 
little is known about how FasL function is controlled. One mechan- 
ism involves the molecule cFLIP, which modulates apoptosis signal- 
ling downstream of Fas 20 . A second mechanism involves proteolytic 
shedding of FasL from the cell surface 17 . DcR3 competes with Fas for 
Figure 3 Inhibition of FasL activity by DcR3. a, Human Jurkat T leukaemia cells 
were incubated with Flag-tagged soluble FasL (sFasL; 5 ng ml⁻¹) oligomerized 
with anti-Flag antibody (0.1 µg ml⁻¹) in the presence of the proposed inhibitors 
DcR3-Fc, Fas-Fc or human IgG1 and assayed for apoptosis (mean ± s.e.m. of 
triplicates). b, Jurkat cells were incubated with sFasL-Flag plus anti-Flag antibody 
as in a, in presence of 1 µg ml⁻¹ DcR3-Fc (filled circles), Fas-Fc (open circles) or 
human IgG1 (triangles), and apoptosis was determined at the indicated time 
points. c, Peripheral blood T cells were stimulated with PHA and interleukin-2, 
followed by control (white bars) or anti-CD3 antibody (filled bars), together with 
phosphate-buffered saline (PBS), human IgG1, Fas-Fc, or DcR3-Fc (10 µg ml⁻¹). 
After 16 h, apoptosis of CD4⁺ cells was determined (mean ± s.e.m. of results from 
five donors). d, Peripheral blood natural killer cells were incubated with ⁵¹Cr- 
labelled Jurkat cells in the presence of DcR3-Fc (filled circles), Fas-Fc (open 
circles) or human IgG1 (triangles), and target-cell death was determined by 
release of ⁵¹Cr (mean ± s.d. for two donors, each in triplicate). 
Figure 4 Genomic amplification of DcR3 in tumours. a, Lung cancers, comprising 
eight adenocarcinomas (c, d, f, g, h, j, k, r), seven squamous-cell carcinomas (a, e, 
m, n, o, p, q), one non-small-cell carcinoma (b), one small-cell carcinoma (i), and 
one bronchial adenocarcinoma (l). The data are means ± s.d. of 2 experiments 
done in duplicate. b, Colon tumours, comprising 17 adenocarcinomas. Data are 
means ± s.e.m. of five experiments done in duplicate. c, In situ hybridization 
analysis of DcR3 mRNA expression in a squamous-cell carcinoma of the lung. A 
representative bright-field image (left) and the corresponding dark-field image 
(right) show DcR3 mRNA over infiltrating malignant epithelium (arrowheads). 
Adjacent non-malignant stroma (S), blood vessel (V) and necrotic tumour tissue 
(N) are also shown. d, Average amplification of DcR3 compared with amplifica- 
tion of neighbouring genomic regions (reverse and forward, Rev and Fwd), the 
DcR3-linked marker T160, and other chromosome-20 markers, in the nine colon 
tumours showing DcR3 amplification of twofold or more (b). Data are from two 
experiments done in duplicate. Asterisk indicates P < 0.01 for a Student's t-test 
comparing each marker with DcR3. 
FasL binding; hence, it may represent a third mechanism of 
extracellular regulation of FasL activity. A decoy receptor that 
modulates the function of the cytokine interleukin-1 has been 
described 21 . In addition, two decoy receptors that belong to the 
TNFR family, DcR1 and DcR2, regulate the FasL-related apoptosis- 
inducing molecule Apo2L 22 . Unlike DcR1 and DcR2, which are 
membrane-associated proteins, DcR3 is directly secreted into the 
extracellular space. One other secreted TNFR-family member is 
OPG 3 , which shares greater sequence homology with DcR3 (31%) 
than do DcR1 (17%) or DcR2 (19%); OPG functions as a third 
decoy for Apo2L 19 . Thus, DcR3 and OPG define a new subset of 
TNFR-family members that function as secreted decoys to mod- 
ulate ligands that induce apoptosis. Pox viruses produce soluble 
TNFR homologues that neutralize specific TNF-family ligands, 
thereby modulating the antiviral immune response 2 . Our results 
indicate that a similar mechanism, namely, production of a soluble 
decoy receptor for FasL, may contribute to immune evasion by 
certain tumours. 

Methods 

Isolation of DcR3 cDNA. Several overlapping ESTs in GenBank (accession 
numbers AA025672, AA025673 and W67560) and in Lifeseq™ (Incyte 
Pharmaceuticals; accession numbers 1339238, 1533571, 1533650, 1542861, 
1789372 and 2207027) showed similarity to members of the TNFR family. We 
screened human cDNA libraries by PCR with primers based on the region of 
EST consensus; fetal lung was positive for a product of the expected size. By 
hybridization to a PCR-generated probe based on the ESTs, one positive clone 
(DNA30942) was identified. When searching for potential alternatively spliced 
forms of DcR3 that might encode a transmembrane protein, we isolated 50 
more clones; the coding regions of these clones were identical in size to that of 
the initial clone (data not shown). 

Fc-fusion proteins (immunoadhesins). The entire DcR3 sequence, or the 
ectodomain of Fas or TNFR1, was fused to the hinge and Fc region of human 
IgG1, expressed in insect SF9 cells or in human 293 cells, and purified as 
described 13 . 

Fluorescence-activated cell sorting (FACS) analysis. We transfected 293 
cells using calcium phosphate or Effectene (Qiagen) with pRK5 vector or pRK5 
encoding full-length human FasL 4 (2 µg), together with pRK5 encoding CrmA 
(2 µg) to prevent cell death. After 16 h, the cells were incubated with 
biotinylated DcR3-Fc or TNFR1-Fc and then with phycoerythrin-conjugated 
streptavidin (GibcoBRL), and were assayed by FACS. The data were analysed by 
Kolmogorov-Smirnov statistical analysis. There was some detectable staining 
of vector-transfected cells by DcR3-Fc; as these cells express little FasL (data 
not shown), it is possible that DcR3 recognized some other factor that is 
expressed constitutively on 293 cells. 

Immunoprecipitation. Human 293 cells were transfected as above, and 
metabolically labelled with [³⁵S]cysteine and [³⁵S]methionine (0.5 mCi; 
Amersham). After 16 h of culture in the presence of z-VAD-fmk (10 µM), 
the medium was immunoprecipitated with DcR3-Fc, Fas-Fc or TNFR1-Fc 
(5 µg), followed by protein A-Sepharose (Repligen). The precipitates were 
resolved by SDS-PAGE and visualized on a phosphorimager (Fuji BAS2000). 
Alternatively, purified, Flag-tagged soluble FasL (1 µg) (Alexis) was incubated 
with each Fc-fusion protein (1 µg), precipitated with protein A-Sepharose, 
resolved by SDS-PAGE and visualized by immunoblotting with rabbit anti- 
FasL antibody (Oncogene Research). 

Analysis of complex formation. Flag-tagged soluble FasL (25 µg) was 
incubated with buffer or with DcR3-Fc (40 µg) for 1.5 h at 24 °C. The reaction 
was loaded onto a Superdex 200 HR 10/30 column (Pharmacia) and developed 
with PBS; 0.6-ml fractions were collected. The presence of DcR3-Fc-FasL 
complex in each fraction was analysed by placing 100 µl aliquots into microtitre 
wells precoated with anti-human IgG (Boehringer) to capture DcR3-Fc, 
followed by detection with biotinylated anti-Flag antibody Bio M2 (Kodak) and 
streptavidin-horseradish peroxidase (Amersham). Calibration of the column 
indicated an apparent relative molecular mass of the complex of 420K (data not 
shown), which is consistent with a stoichiometry of two DcR3-Fc homodimers 
to two soluble FasL homotrimers. 

Equilibrium binding analysis. Microtitre wells were coated with anti-human 
IgG and blocked with 2% BSA in PBS. DcR3-Fc or Fas-Fc was added, followed by 
serially diluted Flag-tagged soluble FasL. Bound ligand was detected with anti- 
Flag antibody as above. In the competition assay, Fas-Fc was immobilized as 
above, and the wells were blocked with excess IgG1 before addition of Flag- 
tagged soluble FasL plus DcR3-Fc. 

T-cell AICD. CD3⁺ lymphocytes were isolated from peripheral blood of 
individual donors using anti-CD3 magnetic beads (Miltenyi Biotech), 
stimulated with phytohaemagglutinin (PHA; 2 µg ml⁻¹) for 24 h, and cultured 
in the presence of interleukin-2 (100 U ml⁻¹) for 5 days. The cells were plated in 
wells coated with anti-CD3 antibody (Pharmingen) and analysed for apoptosis 
16 h later by FACS analysis of annexin-V-binding of CD4⁺ cells. 

Natural killer cell activity. Natural killer cells were isolated from peripheral 
blood of individual donors using anti-CD56 magnetic beads (Miltenyi 
Biotech), and incubated for 16 h with ⁵¹Cr-loaded Jurkat cells at an effector- 
to-target ratio of 1:1 in the presence of DcR3-Fc, Fas-Fc or human IgG1. 
Target-cell death was determined by release of ⁵¹Cr in effector-target co- 
cultures relative to release of ⁵¹Cr by detergent lysis of equal numbers of Jurkat 
cells. 
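
The chromium-release readout described above can be expressed as a simple ratio. The sketch below is only an illustration of that ratio: the function name and the example counts are hypothetical, and no spontaneous-release correction is applied because the text does not mention one.

```python
"""Chromium-release calculation sketch (hypothetical names and example counts)."""


def percent_target_cell_death(coculture_cpm: float, detergent_lysis_cpm: float) -> float:
    """Return 51Cr release in co-culture as a percentage of detergent-lysis release."""
    if coculture_cpm < 0:
        raise ValueError("co-culture counts must be non-negative")
    if detergent_lysis_cpm <= 0:
        raise ValueError("detergent-lysis counts must be positive")
    return 100.0 * coculture_cpm / detergent_lysis_cpm


if __name__ == "__main__":
    # Illustrative counts per minute only; these are not data from the study.
    print(percent_target_cell_death(6500.0, 10000.0))
```

Many chromium-release protocols also subtract a spontaneous-release control from both terms; that variant is left out here to stay with the calculation as the text describes it.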

Gene-amplification analysis. Surgical specimens were provided by J. Kern 
(lung tumours) and P. Quirke (colon tumours). Genomic DNA was extracted 
(Qiagen) and the concentration was determined using Hoechst dye 33258 
intercalation fluorometry. Amplification was determined by quantitative PCR 18 
using a TaqMan instrument (ABI). The method was validated by comparison of 
PCR and Southern hybridization data for the Myc and HER-2 oncogenes (data 
not shown). Gene-specific primers and fluorogenic probes were designed on 
the basis of the sequence of DcR3 or of nearby regions identified on a BAC 
carrying the human DcR3 gene; alternatively, primers and probes were based 
on Stanford Human Genome Center marker AFM218xe7 (T160), which is 
linked to DcR3 (likelihood score = 5.4), SHGC-36268 (T159), the nearest 
available marker which maps to ~500 kilobases from T160, and five extra 
markers that span chromosome 20. The DcR3-specific primer sequences were 
5'-CTTCTTCGCGCACGCTG-3' and 5'-ATCACGCCGGCACCAG-3' and the 
fluorogenic probe sequence was 5'-(FAM)-ACACGATGCGTGCTCCAAGCAGAA-p-(TAMRA), 
where FAM is 5'-fluorescein phosphoramidite. Relative 
gene-copy numbers were derived using the formula 2^(ΔCT), where ΔCT is the 
difference in amplification cycles required to detect DcR3 in peripheral blood 
lymphocyte DNA compared to test DNA. 
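
The relative gene-copy formula above can be sketched in a few lines. The function name and the example threshold-cycle values are hypothetical; the sign convention (reference cycles minus test cycles) is an interpretation of the wording, chosen so that a sample detected in fewer cycles than the reference gives a value above 1. The base of 2 assumes the template doubles every cycle.

```python
"""Relative gene-copy number from threshold cycles (hypothetical example values)."""


def relative_copy_number(ct_reference: float, ct_test: float) -> float:
    """Return 2 ** (Ct_reference - Ct_test).

    ct_reference: cycles needed to detect the gene in reference DNA
                  (peripheral blood lymphocyte DNA in the text).
    ct_test:      cycles needed to detect the gene in the test (tumour) DNA.
    """
    delta_ct = ct_reference - ct_test
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Illustrative values only: detection two cycles earlier implies fourfold more copies.
    print(relative_copy_number(27.0, 25.0))
```

With equal threshold cycles the function returns 1, meaning no amplification relative to the reference DNA.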

Received 24 September; accepted 6 November 1998. 

1. Nagata, S. Apoptosis by death factor. Cell 88, 355-365 (1997). 
2. Smith, C. A., Farrah, T. & Goodwin, R. G. The TNF receptor superfamily of cellular and viral proteins: 
activation, costimulation, and death. Cell 76, 959-962 (1994). 
3. Simonet, W. S. et al. Osteoprotegerin: a novel secreted protein involved in the regulation of bone 
density. Cell 89, 309-319 (1997). 
4. Suda, T., Takahashi, T., Golstein, P. & Nagata, S. Molecular cloning and expression of Fas ligand, a 
novel member of the TNF family. Cell 75, 1169-1178 (1993). 
5. Pennica, D. et al. Human tumour necrosis factor: precursor structure, expression and homology to 
lymphotoxin. Nature 312, 724-729 (1984). 
6. Pitti, R. M. et al. Induction of apoptosis by Apo-2 ligand, a new member of the tumor necrosis factor 
receptor family. J. Biol. Chem. 271, 12687-12690 (1996). 
7. Wiley, S. R. et al. Identification and characterization of a new member of the TNF family that induces 
apoptosis. Immunity 3, 673-682 (1995). 
8. Marsters, S. A. et al. Identification of a ligand for the death-domain-containing receptor Apo3. Curr. 
Biol. 8, 525-528 (1998). 
9. Chicheportiche, Y. et al. TWEAK, a new secreted ligand in the TNF family that weakly induces 
apoptosis. J. Biol. Chem. 272, 32401-32410 (1997). 
10. Wong, B. R. et al. TRANCE is a novel ligand of the TNFR family that activates c-Jun-N-terminal kinase 
in T cells. J. Biol. Chem. 272, 25190-25194 (1997). 
11. Anderson, D. M. et al. A homolog of the TNF receptor and its ligand enhance T-cell growth and 
dendritic-cell function. Nature 390, 175-179 (1997). 
12. Lacey, D. L. et al. Osteoprotegerin ligand is a cytokine that regulates osteoclast differentiation and 
activation. Cell 93, 165-176 (1998). 
13. Dhein, J., Walczak, H., Baumler, C., Debatin, K. M. & Krammer, P. H. Autocrine T-cell suicide 
mediated by Apo1/(Fas/CD95). Nature 373, 438-441 (1995). 
14. Arase, H., Arase, N. & Saito, T. Fas-mediated cytotoxicity by freshly isolated natural killer cells. J. Exp. 
Med. 181, 1235-1238 (1995). 
15. Medvedev, A. E. et al. Regulation of Fas and Fas ligand expression in NK cells by cytokines and the 
involvement of Fas ligand in NK/LAK cell-mediated cytotoxicity. Cytokine 9, 394-404 (1997). 
16. Moretta, A. Mechanisms in cell-mediated cytotoxicity. Cell 90, 13-18 (1997). 
17. Tanaka, M., Itai, T., Adachi, M. & Nagata, S. Downregulation of Fas ligand by shedding. Nature Med. 
4, 31-36 (1998). 
18. Gelmini, S. et al. Quantitative PCR-based homogeneous assay with fluorogenic probes to measure c- 
erbB-2 oncogene amplification. Clin. Chem. 43, 752-758 (1997). 
19. Emery, J. G. et al. Osteoprotegerin is a receptor for the cytotoxic ligand TRAIL. J. Biol. Chem. 273, 
14363-14367 (1998). 
20. Wallach, D. Placing death under control. Nature 388, 123-125 (1997). 
21. Colotta, F. et al. Interleukin-1 type II receptor: a decoy target for IL-1 that is regulated by IL-4. Science 
261, 472-475 (1993). 
22. Ashkenazi, A. & Dixit, V. M. Death receptors: signaling and modulation. Science 281, 1305-1308 
(1998). 
23. Ashkenazi, A. & Chamow, S. M. Immunoadhesins as research tools and therapeutic agents. Curr. 
Opin. Immunol. 9, 195-200 (1997). 
24. Marsters, S. A. et al. Activation of apoptosis by Apo-2 ligand is independent of FADD but blocked by 
CrmA. Curr. Biol. 6, 750-752 (1996). 

Acknowledgements. We thank C. Clark, D. Pennica and V. Dixit for comments, and J. Kern and P. Quirke 
for tumour specimens. 

Correspondence and requests for materials should be addressed to A.A. (e-mail: aa@gene.com). The 
GenBank accession number for the DcR3 cDNA sequence is AF104419. 

Nature © Macmillan Publishers Ltd 1998 

NATURE | VOL 396 | 17 DECEMBER 1998 | www.nature.com 



Crystal structure of the 
ABC transporters (also known as traffic ATPases) form a large 
family of proteins responsible for the translocation of a variety 
of compounds across membranes of both prokaryotes and 
eukaryotes 1 . The recently completed Escherichia coli genome 
sequence revealed that the largest family of paralogous E. coli 
proteins is composed of ABC transporters 2 . Many eukaryotic 
proteins of medical significance belong to this family, such as 
the cystic fibrosis transmembrane conductance regulator (CFTR), 
the P-glycoprotein (or multidrug-resistance protein) and the 
heterodimeric transporter associated with antigen processing 
(Tap1-Tap2). Here we report the crystal structure at 1.5 Å resolu- 
tion of HisP, the ATP-binding subunit of the histidine permease, 
which is an ABC transporter from Salmonella typhimurium. We 
correlate the details of this structure with the biochemical, genetic 
and biophysical properties of the wild-type and several mutant 
HisP proteins. The structure provides a basis for understanding 
properties of ABC transporters and of defective CFTR proteins. 

ABC transporters contain four structural domains: two nucleo- 
tide-binding domains (NBDs), which are highly conserved 
throughout the family, and two transmembrane domains 1 . In 
prokaryotes these domains are often separate subunits which are 
assembled into a membrane-bound complex; in eukaryotes the 
domains are generally fused into a single polypeptide chain. The 
periplasmic histidine permease of S. typhimurium and E. coli 1,3-8 is a 
well-characterized ABC transporter that is a good model for this 
superfamily. It consists of a membrane-bound complex, HisQMP2, 
which comprises integral membrane subunits, HisQ and HisM, and 
two copies of HisP, the ATP-binding subunit. HisP, which has 
properties intermediate between those of integral and peripheral 
membrane proteins 9 , is accessible from both sides of the membrane, 
presumably by its interaction with HisQ and HisM*. The two HisP 
subunits form a dimer, as shown by their cooperativity in ATP 
hydrolysis 5 , the requirement for both subunits to be present for 
activity*, and the formation of a HisP dimer upon chemical cross- 
linking. Soluble HisP also forms a dimer*. HisP has been purified 
and characterized in an active, soluble form 3 which can be recon- 
stituted into a fully active membrane-bound complex*. 

The overall shape of the crystal structure of the HisP monomer is 
that of an 'L' with two thick arms (arm I and arm II); the ATP- 
binding pocket is near the end of arm I (Fig. 1). A six-stranded β- 
sheet (β3 and β8-β12) spans both arms of the L, with a domain of 
α- plus β-type structure (β1, β2, β4-β7, α1 and α2) on one side 
(within arm I) and a domain of mostly α-helices (α3-α9) on the 
Figure 1 Crystal structure of HisP. a, View of the dimer along an axis 
perpendicular to its two-fold axis. The top and bottom of the dimer are suggested 
to face towards the periplasmic and cytoplasmic sides, respectively (see text). 
The thickness of arm II is about 25 Å, comparable to that of membrane. α-Helices 
are shown in orange and β-sheets in green. b, View along the two-fold axis of the 
HisP dimer, showing the relative displacement of the monomers not apparent in 
a. The β-strands at the dimer interface are labelled. c, View of one monomer from 
the bottom of arm I, as shown in a, towards arm II, showing the ATP-binding 
pocket. a-c, The protein and the bound ATP are in 'ribbon' and 'ball-and-stick' 
representations, respectively. Key residues discussed in the text are indicated in 
c. These figures were prepared with MOLSCRIPT 2 *. N, amino terminus; C, C 
terminus. 
NOVEL APPROACH TO QUANTITATIVE POLYMERASE CHAIN REACTION USING
REAL-TIME DETECTION: APPLICATION TO THE DETECTION OF GENE
AMPLIFICATION IN BREAST CANCER

Ivan Bieche 1,2, Martine Olivi 1, Marie-Helene Champeme 2, Dominique Vidaud 1, Rosette Lidereau 2 and Michel Vidaud 1*

1 Laboratoire de Genetique Moleculaire, Faculte des Sciences Pharmaceutiques et Biologiques de Paris, Paris, France
2 Laboratoire d'Oncogenetique, Centre Rene Huguenin, St-Cloud, France



Gene amplification is a common event in the progression of 
human cancers, and amplified oncogenes have been shown to 
have diagnostic, prognostic and therapeutic relevance. A 
kinetic quantitative polymerase-chain-reaction (PCR) method,
based on fluorescent TaqMan methodology and a new instru- 
ment (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real-time, was used to quantify 
gene amplification in tumor DNA. Reactions are character- 
ized by the point during cycling when PCR amplification is still 
in the exponential phase, rather than the amount of PCR 
product accumulated after a fixed number of cycles. None of 
the reaction components is limited during the exponential
phase, meaning that values are highly reproducible in reac- 
tions starting with the same copy number. This greatly 
improves the precision of DNA quantification. Moreover, 
real-time PCR does not require post-PCR sample handling, 
thereby preventing potential PCR-product carry-over con- 
tamination; it possesses a wide dynamic range of quantifica- 
tion and results in much faster and higher sample throughput. 
The real-time PCR method was used to develop and validate
a simple and rapid assay for the detection and quantification 
of the 3 most frequently amplified genes (myc, ccnd1 and
erbB2) in breast tumors. Extra copies of myc, ccnd1 and erbB2
were observed in 10, 23 and 15%, respectively, of 108 breast- 
tumor DNA; the largest observed numbers of gene copies 
were 4.6, 18.6 and 15.1, respectively. These results correlated 
well with those of Southern blotting. The use of this new 
semi-automated technique will make molecular analysis of 
human cancers simpler and more reliable, and should find 
broad applications in clinical and research settings. Int. J.
Cancer 78:661-666, 1998.
© 1998 Wiley-Liss, Inc.

Gene amplification plays an important role in the pathogenesis 
of various solid tumors, including breast cancer, probably because 
over-expression of the amplified target genes confers a selective 
advantage. The first technique used to detect genomic amplification 
was cytogenetic analysis. Amplifications of several chromosome
regions, visualized either as extrachromosomal double minutes
(dmins) or as integrated homogeneously staining regions (HSRs),
are among the main visible cytogenetic abnormalities in breast
tumors. Other techniques such as comparative genomic hybridization
(CGH) (Kallioniemi et al., 1994) have also been used in broad
searches for regions of increased DNA copy numbers in tumor
cells, and have revealed some 20 amplified chromosome regions in
breast tumors. Positional cloning efforts are underway to identify
the critical gene(s) in each amplified region. To date, genes known
to be amplified frequently in breast cancers include myc (8q24),
ccnd1 (11q13), and erbB2 (17q12-q21) (for review, see Bieche and
Lidereau, 1995).

Amplification of the myc, ccnd1, and erbB2 proto-oncogenes
should have clinical relevance in breast cancer, since independent
studies have shown that these alterations can be used to identify
sub-populations with a worse prognosis (Berns et al., 1992;
Schuuring et al., 1992; Slamon et al., 1987). Muss et al. (1994)
suggested that these gene alterations may also be useful for the
prediction and assessment of the efficacy of adjuvant chemotherapy
and hormone therapy.

However, published results diverge both in terms of the frequency
of these alterations and their clinical value. For instance,
over 500 studies in 10 years have failed to resolve the controversy
surrounding the link suggested by Slamon et al. (1987) between
erbB2 amplification and disease progression. These discrepancies
are partly due to the clinical, histological and ethnic heterogeneity
of breast cancer, but technical considerations are also probably
involved.

Specific genes (DNA) were initially quantified in tumor cells by
means of blotting procedures such as Southern and slot blotting.
These batch techniques require large amounts of DNA (5-10
µg/reaction) to yield reliable quantitative results. Furthermore,
meticulous care is required at all stages of the procedures to 
generate blots of sufficient quality for reliable dosage analysis. 
Recently, PCR has proven to be a powerful tool for quantitative 
DNA analysis, especially with minimal starting quantities of tumor 
samples (small, early-stage tumors and formalin-fixed, paraffin- 
embedded tissues). 

Quantitative PCR can be performed by evaluating the amount of
product either after a given number of cycles (end-point quantitative
PCR) or after a varying number of cycles during the
exponential phase (kinetic quantitative PCR). In the first case, an
internal standard distinct from the target molecule is required to
ascertain PCR efficiency. The method is relatively easy but implies
generating, quantifying and storing an internal standard for each
gene studied. Nevertheless, it is the most frequently applied
method to date.

One of the major advantages of the kinetic method is its rapidity 
in quantifying a new gene, since no internal standard is required (an 
external standard curve is sufficient). Moreover, the kinetic method 
has a wide dynamic range (at least 5 orders of magnitude), giving 
an accurate value for samples differing in their copy number. 
Unfortunately, the method is cumbersome and has therefore been 
rarely used. It involves aliquot sampling of each assay mix at 
regular intervals and quantifying, for each aliquot, the amplification
product. Interest in the kinetic method has been stimulated by a
novel approach using fluorescent TaqMan methodology and a new
instrument (ABI Prism 7700 Sequence Detection System) capable
of measuring fluorescence in real time (Gibson et al., 1996; Heid et
al., 1996). The TaqMan reaction is based on the 5' nuclease assay
first described by Holland et al. (1991). The latter uses the 5'
nuclease activity of Taq polymerase to cleave a specific fluorogenic
oligonucleotide probe during the extension phase of PCR. The
approach uses dual-labeled fluorogenic hybridization probes (Lee
et al., 1993). One fluorescent dye, covalently linked to the 5' end
of the oligonucleotide, serves as a reporter [FAM (i.e.,
6-carboxy-fluorescein)] and its emission spectrum is quenched by a second
fluorescent dye, TAMRA (i.e., 6-carboxy-tetramethyl-rhodamine),
attached to the 3' end. During the extension phase of the PCR
cycle, the fluorescent hybridization probe is hydrolyzed by the
5'-3' nucleolytic activity of DNA polymerase. Nuclease degrada- 
tion of the probe releases the quenching of FAM fluorescence 
emission, resulting in an increase in peak fluorescence emission. 
The fluorescence signal is normalized by dividing the emission 
intensity of the reporter dye (FAM) by the emission intensity of a 
reference dye (i.e., ROX, 6-carboxy-X-rhodamine) included in 
TaqMan buffer, to obtain a ratio defined as the Rn (normalized 
reporter) for a given reaction tube. The use of a sequence detector 
enables the fluorescence spectra of all 96 wells of the thermal 
cycler to be measured continuously during PCR amplification. 
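
The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function names and the intensity values in the usage note are hypothetical, the 15-cycle baseline is taken from the legend of Figure 1, and the instrument software may process the raw spectra differently.

```python
def normalized_reporter(fam, rox):
    """Rn per cycle: reporter (FAM) emission divided by reference (ROX) emission."""
    if len(fam) != len(rox):
        raise ValueError("FAM and ROX traces must cover the same cycles")
    return [f / r for f, r in zip(fam, rox)]


def delta_rn(rn, baseline_cycles=15):
    """Rn minus the baseline, taken here as the mean Rn of the first cycles."""
    if len(rn) < baseline_cycles:
        raise ValueError("trace is shorter than the baseline window")
    baseline = sum(rn[:baseline_cycles]) / baseline_cycles
    return [value - baseline for value in rn]
```

For a hypothetical tube whose FAM signal doubles at cycle 16 while ROX stays constant, `delta_rn(normalized_reporter(fam, rox))` is zero over the baseline window and positive afterwards.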

The real-time PCR method offers several advantages over other
current quantitative PCR methods (Celi et al., 1994): (i) the
probe-based homogeneous assay provides a real-time method for
detecting only specific amplification products, since specific
hybridization of both the primers and the probe is necessary to generate a
signal; (ii) the Ct (threshold cycle) value used for quantification is
measured when PCR amplification is still in the log phase of PCR
product accumulation. This is the main reason why Ct is a more
reliable measure of the starting copy number than are end-point
measurements, in which a slight difference in a limiting component
can have a drastic effect on the amount of product; (iii) use of Ct
values gives a wider dynamic range (at least 5 orders of magnitude),
reducing the need for serial dilution; (iv) the real-time PCR
method is run in a closed-tube system and requires no post-PCR
sample handling, thus avoiding potential contamination; (v) the
system is highly automated, since the instrument continuously
measures fluorescence in all 96 wells of the thermal cycler during
PCR amplification and the corresponding software processes and
analyzes the fluorescence data; (vi) the assay is rapid, as results are
available just one minute after thermal cycling is complete; (vii) the
sample throughput of the method is high, since 96 reactions can be
analyzed in 2 hr.

Here, we applied this semi-automated procedure to determine 
the copy numbers of the 3 most frequently amplified genes in breast
tumors (myc, ccnd1 and erbB2), as well as 2 genes (alb and app)
located in chromosome regions in which no genetic changes have
been observed in breast tumors. The results for 108 breast tumors
were compared with previous Southern-blot data for the same 
samples. 



MATERIAL AND METHODS 
Tumor and blood samples 

Samples were obtained from 108 primary breast tumors removed
surgically from patients at the Centre Rene Huguenin; none of the
patients had undergone radiotherapy or chemotherapy. Immediately
after surgery, the tumor samples were placed in liquid
nitrogen until extraction of high-molecular-weight DNA. Patients
were included in this study if the tumor sample used for DNA
preparation contained more than 60% of tumor cells (histological
analysis). A blood sample was also taken from 18 of the same
patients.

DNA was extracted from tumor tissue and blood leukocytes 
according to standard methods. 

Real-time PCR 

Theoretical basis. Reactions are characterized by the point
during cycling when amplification of the PCR product is first
detected, rather than by the amount of PCR product accumulated
after a fixed number of cycles. The higher the starting copy number
of the genomic DNA target, the earlier a significant increase in
fluorescence is observed. The parameter Ct (threshold cycle) is
defined as the fractional cycle number at which the fluorescence
generated by cleavage of the probe passes a fixed threshold above
baseline. The target gene copy number in unknown samples is
quantified by measuring Ct and by using a standard curve to
determine the starting copy number. The precise amount of
genomic DNA (based on optical density) and its quality (i.e., lack
of extensive degradation) are both difficult to assess. We therefore
also quantified a control gene (alb) mapping to chromosome region
4q11-q13, in which no genetic alterations have been found in
breast-tumor DNA by means of CGH (Kallioniemi et al., 1994).
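
To make the definition of the threshold cycle concrete, the sketch below locates the fractional cycle at which a baseline-corrected fluorescence trace first crosses a fixed threshold. Linear interpolation between the two flanking cycles is an assumption made for illustration; the text does not state how the instrument software computes the fractional part, and the function name and sample trace are hypothetical.

```python
def threshold_cycle(delta_rn, threshold):
    """Fractional cycle at which delta_rn first rises through `threshold`.

    delta_rn[0] is the reading for cycle 1. Returns None if the trace never
    crosses the threshold from below (e.g. a no-template control).
    """
    for i in range(1, len(delta_rn)):
        previous, current = delta_rn[i - 1], delta_rn[i]
        if previous < threshold <= current:
            # Assumed: signal rises linearly between cycle i and cycle i + 1.
            fraction = (threshold - previous) / (current - previous)
            return i + fraction
    return None
```

With the hypothetical trace `[0.0, 0.0, 0.1, 0.3, 0.7]` and a threshold of 0.2, the crossing falls halfway between cycles 3 and 4. A sample with more starting copies crosses earlier and therefore has a lower Ct.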

Thus, the ratio of the copy number of the target gene to the copy 
number of the alb gene normalizes the amount and quality of 
genomic DNA. The ratio defining the level of amplification is
termed "N", and is determined as follows:

$$N = \frac{\text{copy number of target gene (app, myc, ccnd1, erbB2)}}{\text{copy number of reference gene (alb)}}$$
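
As a worked illustration of this ratio, the following Python sketch divides the mean target copy number by the mean reference copy number, as described in the footnote to Table II. The replicate values are those listed for tumor T133 in Table II; the function name is hypothetical.

```python
from statistics import mean


def amplification_level(target_copies, reference_copies):
    """N: mean copy number of the target gene over that of the reference gene."""
    return mean(target_copies) / mean(reference_copies)


# Triplicate copy numbers for tumor T133 (ccnd1 and alb), from Table II.
n_t133 = amplification_level([59821, 61659, 61821], [9787, 10092, 10533])
print(round(n_t133, 2))  # 6.03
```

Because both genes are measured in the same DNA preparation, errors in the amount or quality of input DNA affect numerator and denominator alike and cancel in the ratio.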

Primers, probes, reference human genomic DNA and PCR 
consumables. Primers and probes were chosen with the assistance 
of the computer programs Oligo 4.0 (National Biosciences, Plymouth,
MN), EuGene (Daniben Systems, Cincinnati, OH) and Primer
Express (Perkin-Elmer Applied Biosystems, Foster City, CA). 

Primers were purchased from DNAgency (Malvern, PA) and 
probes from Perkin-Elmer Applied Biosystems. 

Nucleotide sequences for the oligonucleotide hybridization 
probes and primers are available on request. 

The TaqMan PCR Core reagent kit, MicroAmp optical tubes, 
and MicroAmp caps were from Perkin-Elmer Applied Biosystems.

Standard-curve construction. The kinetic method requires a
standard curve. The latter was constructed with serial dilutions of
specific PCR products, according to Piatak et al. (1993). In
practice, each specific PCR product was obtained by amplifying 20
ng of a standard human genomic DNA (Boehringer, Mannheim,
Germany) with the same primer pairs as those used later for
real-time quantitative PCR. The 5 PCR products were purified
using MicroSpin S-400 HR columns (Pharmacia, Uppsala, Sweden),
electrophoresed through an acrylamide gel and stained with
ethidium bromide to check their quality. The PCR products were
then quantified spectrophotometrically and pooled, and serially
diluted 10-fold in mouse genomic DNA (Clontech, Palo Alto, CA)
at a constant concentration of 2 ng/µl. The standard curve used for
real-time quantitative PCR was based on serial dilutions of the pool
of PCR products ranging from 10^-7 (10^5 copies of each gene) to
10^-10 (10^2 copies). This series of diluted PCR products was
aliquoted and stored at -80°C until use.

The standard curve was validated by analyzing 2 known
quantities of calibrator human genomic DNA (20 ng and 50 ng).
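
The use of such a standard curve can be sketched as follows: Ct is regressed on the logarithm of the starting copy number of the standards, and the fitted line is then inverted to read the copy number of an unknown from its Ct. This is a minimal sketch under stated assumptions. Ordinary least squares is assumed, the function names are hypothetical, and the Ct values for the standards in the usage note are invented for illustration rather than taken from the study.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)
```

For example, hypothetical standards of 10^5, 10^4, 10^3 and 10^2 copies with Ct values of 22.5, 26.0, 29.5 and 33.0 give a slope of -3.5 and an intercept of 40, and an unknown with a Ct of 27.75 is then estimated at about 3160 copies.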

PCR amplification. Amplification mixes (50 µl) contained the
sample DNA (around 20 ng, around 6600 copies of disomic genes),
10X TaqMan buffer (5 µl), 200 µM dATP, dCTP, dGTP, and 400
µM dUTP, 5 mM MgCl2, 1.25 units of AmpliTaq Gold, 0.5 units of
AmpErase uracil N-glycosylase (UNG), 200 nM each primer and
100 nM probe. The thermal cycling conditions comprised 2 min at
50°C and 10 min at 95°C. Thermal cycling consisted of 40 cycles at
95°C for 15 s and 65°C for 1 min. Each assay included: a standard
curve (from 10^5 to 10^2 copies) in duplicate, a no-template control,
20 ng and 50 ng of calibrator human genomic DNA (Boehringer) in
triplicate, and about 20 ng of unknown genomic DNA in triplicate
(26 samples can thus be analyzed on a 96-well microplate). All
samples with a coefficient of variation (CV) higher than 10% were
retested.
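
The retest rule can be illustrated with a short Python sketch. Two points are assumptions, since the text does not specify them: the CV is computed here on the triplicate copy numbers (rather than on Ct values), and the sample standard deviation is used. The function names are hypothetical; the first set of replicates is the ccnd1 triplicate for tumor T118 from Table II.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def needs_retest(values, max_cv_percent=10.0):
    """True when the replicates are more variable than the accepted limit."""
    return coefficient_of_variation(values) > max_cv_percent


print(needs_retest([4525, 4605, 4678]))  # False (CV is about 1.7%)
```

A triplicate with widely scattered values, such as the hypothetical `[100, 150, 200]`, has a CV above 10% and would be flagged for retesting.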

All reactions were performed in the ABI Prism 7700 Sequence 
Detection System (Perkin-Elmer Applied Biosystems), which 
detects the signal from the fluorogenic probe during PCR.

Equipment for real-time detection. The 7700 system has a 
built-in thermal cycler and a laser directed via fiber optical cables 
to each of the 96 sample wells. A charge-coupled-device (CCD)
camera collects the emission from each sample and the data are
analyzed automatically. The software accompanying the 7700
system calculates Ct and determines the starting copy number in the
samples.
Determination of gene amplification. Gene amplification was
calculated as described above. Only samples with an N value 
higher than 2 were considered to be amplified. 
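
The decision rule can be written as a small Python function. This is a sketch: the function name and category labels are hypothetical, the cut-off of 2 is applied inclusively to match the wording "2 or more" used in the Results, and the lower cut-off of 0.5 is the value used in the Results to flag a decreased copy number.

```python
def classify_gene_dose(n, amplified_cutoff=2.0, decreased_cutoff=0.5):
    """Label an N value as 'amplified', 'decreased' or 'normal'."""
    if n >= amplified_cutoff:
        return "amplified"
    if n < decreased_cutoff:
        return "decreased"
    return "normal"
```

Applied to the three tumors of Table II, T118 (N = 1.06) is classed as normal, while T133 (N = 6.03) and T145 (N = 16.34) are classed as amplified.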

RESULTS 

To validate the method, real-time PCR was performed on 
genomic DNA extracted from 108 primary breast tumors, and 18 
normal leukocyte DNA samples from some of the same patients. 
The target genes were the myc, ccnd1 and erbB2 proto-oncogenes,
and the β-amyloid precursor protein gene (app), which maps to a
chromosome region (21q21.2) in which no genetic alterations have
been found in breast tumors (Kallioniemi et al., 1994). The
reference disomic gene was the albumin gene (alb, chromosome
4q11-q13).



Validation of the standard curve and dynamic range
of real-time PCR 

The standard curve was constructed from PCR products serially 
diluted in genomic mouse DNA at a constant concentration of 
2 ng/µl. It should be noted that the 5 primer pairs chosen to analyze
the 5 target genes do not amplify genomic mouse DNA (data not
shown). Figure 1 shows the real-time PCR standard curve for the
alb gene. The dynamic range was wide (at least 4 orders of
magnitude), with samples containing as few as 10^2 copies or as
many as 10^5 copies.

Copy-number ratio of the 2 reference genes (app and alb)

The app to alb copy-number ratio was determined in 18 normal 
leukocyte DNA samples and all 108 primary breast-tumor DNA 
Figure 1 - Albumin (alb) gene dosage by real-time PCR. Top: Amplification plots for reactions with starting alb gene copy number ranging
from 10^5 (A9), 10^4 (A7), 10^3 (A4) to 10^2 (A2) and a no-template control (A1). Cycle number is plotted vs. change in normalized reporter signal
(ΔRn). For each reaction tube, the fluorescence signal of the reporter dye (FAM) is divided by the fluorescence signal of the passive reference dye
(ROX), to obtain a ratio defined as the normalized reporter signal (Rn). ΔRn represents the normalized reporter signal (Rn) minus the baseline
signal established in the first 15 PCR cycles. ΔRn increases during PCR as alb PCR product copy number increases until the reaction reaches a
plateau. Ct (threshold cycle) represents the fractional cycle number at which a significant increase in Rn above a baseline signal (horizontal black
line) can first be detected. Two replicate plots were performed for each standard sample, but the data for only one are shown here. Bottom:
Standard curve plotting log starting copy number vs. Ct (threshold cycle). The black dots represent the data for standard samples plotted in
duplicate and the red dots the data for unknown genomic DNA samples plotted in triplicate. The standard curve shows 4 orders of linear dynamic
range.
samples. We selected these 2 genes because they are located in 2
chromosome regions (app, 21q21.2; alb, 4q11-q13) in which no
obvious genetic changes (including gains or losses) have been
observed in breast cancers (Kallioniemi et al., 1994). The ratio for
the 18 normal leukocyte DNA samples fell between 0.7 and 1.3 
(mean 1.02 ± 0.21), and was similar for the 108 primary breast- 
tumor DNA samples (0.6 to 1.6, mean 1.06 ± 0.25), confirming 
that alb and app are appropriate reference disomic genes for 
breast-tumor DNA. The low range of the ratios also confirmed that 
the nucleotide sequences chosen for the primers and probes were 
not polymorphic, as mismatches of their primers or probes with the 
subject's DNA would have resulted in differential amplification. 

myc, ccnd1 and erbB2 gene dose in normal leukocyte DNA

To determine the cut-off point for gene amplification in breast-cancer
tissue, 18 normal leukocyte DNA samples were tested for
the gene dose (N), calculated as described in "Material and
Methods". The N value of these samples ranged from 0.5 to 1.3
(mean 0.84 ± 0.22) for myc, 0.7 to 1.6 (mean 1.06 ± 0.23) for
ccnd1 and 0.6 to 1.3 (mean 0.91 ± 0.19) for erbB2. Since N values
for myc, ccnd1 and erbB2 in normal leukocyte DNA consistently
fell between 0.5 and 1.6, values of 2 or more were considered to
represent gene amplification in tumor DNA.

myc, ccnd1 and erbB2 gene dose in breast-tumor DNA

myc, ccnd1 and erbB2 gene copy numbers in the 108 primary
breast tumors are reported in Table I. Extra copies of ccnd1 were
more frequent (23%, 25/108) than extra copies of erbB2 (15%,
16/108) and myc (10%, 11/108), and ranged from 2 to 18.6 for
ccnd1, 2 to 15.1 for erbB2, and only 2 to 4.6 for the myc gene.
Figure 2 and Table II represent tumors in which the ccnd1 gene was
amplified 16-fold (T145), 6-fold (T133) and non-amplified (T118).
The 3 genes were never found to be co-amplified in the same tumor.
erbB2 and ccnd1 were co-amplified in only 3 cases, myc and ccnd1
in 2 cases and myc and erbB2 in 1 case. This favors the hypothesis
that gene amplifications are independent events in breast cancer.
Interestingly, 5 tumors showed a decrease of at least 50% in the
erbB2 copy number (N < 0.5), suggesting that they bore deletions
of the 17q21 region (the site of erbB2). No such decrease in copy
number was observed with the other 2 proto-oncogenes.

Comparison of gene dose determined by real-time quantitative
PCR and Southern-blot analysis 

Southern-blot analysis of myc, ccnd1 and erbB2 amplifications
had previously been done on the same 108 primary breast tumors. A
perfect correlation between the results of real-time PCR and
Southern blot was obtained for tumors with high copy numbers
(N ≥ 5). However, there were cases (1 myc, 6 ccnd1 and 4 erbB2)
in which real-time PCR showed gene amplification whereas
Southern blot did not, but these were mainly cases with low extra
copy numbers (N from 2 to 2.9).

DISCUSSION 

The clinical applications of gene amplification assays are 
currently limited, but would certainly increase if a simple, standard- 
ized and rapid method were perfected. Gene amplification status 
has been studied mainly by means of Southern blotting, but this 
method is not sensitive enough to detect low-level gene amplifica- 
tion nor accurate enough to quantify the full range of amplification 
values. Southern blotting is also time-consuming, uses radioactive 



TABLE I - DISTRIBUTION OF AMPLIFICATION LEVEL (N) FOR myc,
ccnd1 AND erbB2 GENES IN 108 HUMAN BREAST TUMORS

                      Amplification level (N)
Gene      <0.5        0.5-1.9       2-4.9         ≥5
myc       0           97 (89.8%)    11 (10.2%)    0
ccnd1     0           83 (76.9%)    17 (15.7%)    8 (7.4%)
erbB2     5 (4.6%)    87 (80.6%)    8 (7.4%)      8 (7.4%)

reagents and requires relatively large amounts of high-quality 
genomic DNA, which means it cannot be used routinely in many 
laboratories. An amplification step is therefore required to deter- 
mine the copy number of a given target gene from minimal 
quantities of tumor DNA (small early-stage tumors, cytopuncture 
specimens or formalin-fixed, paraffin-embedded tissues). 

In this study, we validated a PCR method developed for the 
quantification of gene over-representation in tumors. The method,
based on real-time analysis of PCR amplification, has several
advantages over other PCR-based quantitative assays such as
competitive quantitative PCR (Celi et al., 1994). First, the real-time
PCR method is performed in a closed-tube system, avoiding the 
risk of contamination by amplified products. Re-amplification of 
carryover PCR products in subsequent experiments can also be 
prevented by using the enzyme uracil N-glycosylase (UNG) 
(Longo et al., 1990). The second advantage is the simplicity and
rapidity of sample analysis, since no post-PCR manipulations are 
required. Our results show that the automated method is reliable. 
We found it possible to determine, in triplicate, the number of 
copies of a target gene in more than 100 tumors per day. Third, the 
system has a linear dynamic range of at least 4 orders of magnitude, 
meaning that samples do not have to contain equal starting amounts 
of DNA. This technique should therefore be suitable for analyzing 
formalin-fixed, paraffin-embedded tissues. Fourth, and above all, 
real-time PCR makes DNA quantification much more precise and
reproducible, since it is based on Ct values rather than end-point
measurement of the amount of accumulated PCR product. Indeed,
the ABI Prism 7700 Sequence Detection System enables Ct to be
calculated when PCR amplification is still in the exponential phase
and when none of the reaction components is rate-limiting. The
within-run CV of the Ct value for calibrator human DNA (5
replicates) was always below 5%, and the between-assay precision
in 5 different runs was always below 10% (data not shown). In
addition, the use of a standard curve is not absolutely necessary,
since the copy number can be determined simply by comparing the
Ct ratio of the target gene with that of reference genes. The results
obtained by the 2 methods (with and without a standard curve) are 
similar in our experiments (data not shown). Moreover, unlike 
competitive quantitative PCR, real-time PCR does not require an 
internal control (the design and storage of internal controls and the 
validation of their amplification efficiency is laborious). 
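
The text does not give the formula used for the calculation without a standard curve. One common way to compare Ct values directly is the comparative Ct calculation sketched below, which is offered only as an illustration and not as the authors' procedure. It assumes that target and reference amplify with the same efficiency (a doubling per cycle by default) and that a normal DNA sample is available as calibrator; the function name and the Ct values in the usage note are hypothetical.

```python
def relative_copy_number(ct_target_tumor, ct_reference_tumor,
                         ct_target_normal, ct_reference_normal,
                         efficiency=2.0):
    """Target gene dose in tumor relative to normal DNA, from Ct values alone.

    Each cycle of Ct difference corresponds to a factor of `efficiency`
    in starting copy number (assumed equal for both genes).
    """
    delta_tumor = ct_target_tumor - ct_reference_tumor
    delta_normal = ct_target_normal - ct_reference_normal
    return efficiency ** (-(delta_tumor - delta_normal))
```

For instance, if the target crosses the threshold 3 cycles earlier than the reference in the tumor (22.0 vs. 25.0) but at the same cycle in normal DNA (25.0 vs. 25.0), the estimated relative copy number is 2^3 = 8.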

The only potential disadvantage of real-time PCR, like all other
PCR-based methods and solid-matrix blotting techniques (Southern
blots and dot blots), is that it cannot avoid dilution artifacts
inherent in the extraction of DNA from tumor cells contained in
heterogeneous tissue specimens. Only FISH and immunohistochemistry
can measure alterations on a cell-by-cell basis (Pauletti et al.,
1996; Slamon et al., 1989). However, FISH requires expensive
equipment and trained personnel and is also time-consuming. 
Moreover, FISH does not assess gene expression and therefore 
cannot detect cases in which the gene product is over-expressed in 
the absence of gene amplification, which will be possible in the 
future by real-time quantitative RT-PCR. Immunohistochemistry is 
subject to considerable variations in the hands of different teams 
owing to alterations of target proteins during the procedure, the 
different primary antibodies and fixation methods used and the 
criteria used to define positive staining. 

The results of this study are in agreement with those reported in
the literature. (i) Chromosome regions 4q11-q13 and 21q21.2
(which bear alb and app, respectively) showed no genetic alterations
in the breast-cancer samples studied here, in keeping with
the results of CGH (Kallioniemi et al., 1994). (ii) We found that
amplifications of these 3 oncogenes were independent events, as
reported by other teams (Berns et al., 1992; Borg et al., 1992). (iii)
The frequency and degree of myc amplification in our breast tumor
DNA series were lower than those of ccnd1 and erbB2 amplification,
confirming the findings of Borg et al. (1992) and Courjal et al.
(1997). (iv) The maxima of ccnd1 and erbB2 over-representation
were 18-fold and 15-fold, also in keeping with earlier results (about
Figure 2 - ccnd1 and alb gene dosage by real-time PCR in 3 breast tumor samples: T118 (E12, C6; black squares), T133 (G11, [...])
and T145 (A8, C8; blue squares). [...] Triplicate plots were performed for each tumor sample, but the data for only one are shown here. The results are shown in Table II.



30-fold maximum) (Berns et al., 1992; Borg et al., 1992; Courjal et
al., 1997). (v) The erbB2 copy numbers obtained with real-time
PCR were in good agreement with data obtained with other
quantitative PCR-based assays in terms of the frequency and
degree of amplification (An et al., 1995; Deng et al., 1996; Valeron
et al., 1996). Our results also correlate well with those recently
published by Gelmini et al. (1997), who used the TaqMan system to
measure erbB2 amplification in a small series of breast tumors
(n = 25), but with an instrument (LS-50B luminescence spectrometer,
Perkin-Elmer Applied Biosystems) which only allows end-
TABLE II - EXAMPLES OF ccnd1 GENE DOSAGE RESULTS
FROM 3 BREAST TUMORS 1

                     ccnd1                           alb
Tumor    Copy number   Mean      SD      Copy number   Mean     SD     N ccnd1/alb
T118     4525                            4223
         4605          4603      77      4365          4325     89     1.06
         4678                            4387
T133     59821                           9787
         61659         61100     1111    10092         10137    375    6.03
         61821                           10533
T145     128563                          7321
         125892        125392    3448    7762          7672     316    16.34
         121722                          7933

1 For each sample, 3 replicate experiments were performed and the mean
and the standard deviation (SD) were determined. The level of ccnd1 gene
amplification (N ccnd1/alb) is determined by dividing the average ccnd1
copy number value by the average alb copy number value.



point measurement of fluorescence intensity. Here we report myc
and ccnd1 gene dosage in breast cancer by means of quantitative
PCR. (vi) We found a high degree of concordance between
real-time quantitative PCR and Southern-blot analysis in terms of
gene amplification, especially for samples with high copy numbers
(>5-fold). The slightly higher frequency of gene amplification
(especially ccnd1 and erbB2) observed by means of real-time
quantitative PCR as compared with Southern-blot analysis may be
explained by the higher sensitivity of the former method. However,
we cannot rule out the possibility that some tumors with a few extra
gene copies observed in real-time PCR had additional copies of an
arm or a whole chromosome (trisomy, tetrasomy or polysomy)
rather than true gene amplification. These 2 types of genetic
alteration (polysomy and gene amplification) could be easily
distinguished in the future by using an additional probe located on
the same chromosome arm, but some distance from the target gene.
It is noteworthy that high gene copy numbers have the greatest
prognostic significance in breast carcinoma (Borg et al., 1992;
Slamon et al., 1987).

Finally, this technique can be applied to the detection of gene 
deletion as well as gene amplification. Indeed, we found a 
decreased copy number of erbB2 (but not of the other 2 proto-oncogenes)
in several tumors; erbB2 is located in a chromosome
region (17q21) reported to contain both deletions and amplifications
in breast cancer (Bieche and Lidereau, 1995).

In conclusion, gene amplification in various cancers can be used 
as a marker of pre-neoplasia, and also for early diagnosis of cancer,
staging, prognostication and choice of treatment. Southern blotting
is not sufficiently sensitive, and FISH is lengthy and complex.
Real-time quantitative PCR overcomes both these limitations, and
is a sensitive and accurate method of analyzing large numbers of
samples in a short time. It should find a place in routine clinical
gene dosage.
50. Ashkenazi, A., Pai, R., Fong, S., Leung, S., Lawrence, D., Marsters, S., Blackie,
C., Chang, L., McMurtrey, A., Hebert, A., DeForge, L., Koumenis, I., Lewis, D.,
Harris, L., Bussiere, J., Koeppen, H., Shahrokh, Z., and Schwall, R. Safety and
Ashkenazi, A. Apo2L/TRAIL-dependent recruitment of endogenous FADD and
Caspase-8 to death receptors 4 and 5. Immunity 12, 611-620 (2000).
Review articles:

1. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D. J.
Functional role of muscarinic acetylcholine receptor subtype diversity. Cold
Spring Harbor Symposium on Quantitative Biology. LIII, 263-272 (1988).

2. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.
Functional diversity of muscarinic receptor subtypes in cellular signal
transduction and growth. Trends Pharmacol. Sci. Dec Supplement, 12-21 (1989).

3. Chamow, S., Duliege, A., Ammann, A., Kahn, J., Allen, D., Eichberg, J., Byrn,
R., Capon, D., Ward, R., and Ashkenazi, A. CD4 immunoadhesins in anti-HIV
therapy: new developments. Int. J. Cancer Supplement 7, 69-72 (1992).

4. Ashkenazi, A., Capon, D., and Ward, R. Immunoadhesins. Int. Rev. Immunol. 10,
217-225 (1993).

5. Ashkenazi, A., and Peralta, E. Muscarinic Receptors. In Handbook of Receptors
and Channels. (S. Peroutka, ed.), CRC Press, Boca Raton, Vol. I, p. 1-27 (1994).

6. Krantz, S. B., Means, R. T., Jr., Lina, J., Marsters, S. A., and Ashkenazi, A.
Inhibition of erythroid colony formation in vitro by gamma interferon. In
Molecular Biology of Hematopoiesis (N. Abraham, R. Shadduck, A. Levine, F.
Takaku, eds.) Intercept Ltd. Paris, Vol. 3, p. 135-147 (1994).

7. Ashkenazi, A. Cytokine neutralization as a potential therapeutic approach for
SIRS and shock. J. Biotechnology in Healthcare 1, 197-206 (1994).

8. Ashkenazi, A., and Chamow, S. M. Immunoadhesins: an alternative to human
monoclonal antibodies. Immunomethods: A companion to Methods in
Enzymology 8, 104-115 (1995).

9. Chamow, S., and Ashkenazi, A. Immunoadhesins: Principles and Applications.
Trends Biotech. 14, 52-60 (1996).

10. Ashkenazi, A., and Chamow, S. M. Immunoadhesins as research tools and
therapeutic agents. Curr. Opin. Immunol. 9, 195-200 (1997).

11. Ashkenazi, A., and Dixit, V. Death receptors: signaling and modulation. Science
281, 1305-1308 (1998).

12. Ashkenazi, A., and Dixit, V. Apoptosis control by death and decoy receptors.
Curr. Opin. Cell Biol. 11, 255-260 (1999).
13. Ashkenazi, A. Chapters on Apo2L/TRAIL; DR4, DR5, DcR1, DcR2; and DcR3.
Online Cytokine Handbook (www.apnet.com/cytokinereference/).

14. Ashkenazi, A. Targeting death and decoy receptors of the tumor necrosis factor
superfamily. Nature Rev. Cancer 2, 420-430 (2002).

15. LeBlanc, H. and Ashkenazi, A. Apoptosis signaling by Apo2L/TRAIL. Cell Death
and Differentiation 10, 66-75 (2003).

16. Almasan, A. and Ashkenazi, A. Apo2L/TRAIL: apoptosis signaling, biology, and
potential for cancer therapy. Cytokine and Growth Factor Reviews 14, 337-348
(2003).

Book:

Antibody Fusion Proteins (Chamow, S., and Ashkenazi, A., eds., John Wiley and
Sons Inc.) (1999).
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1 . Resistance of primary HIV isolates to CD4 is independent of CD4-gpl20 binding 
affinity. UCSD Symposium, HIV Disease: Pathogenesis and Therapy. 
Greenelefe, FL, March 1991. 

2. Use of immuno-hybrids to extend the half-life of receptors. IBC conference on
Biopharmaceutical Half-life Extension. New Orleans, LA, June 1992.

3. Results with TNF receptor Immunoadhesins for the Treatment of Sepsis. IBC
conference on Endotoxemia and Sepsis. Philadelphia, PA, June 1992.

4. Immunoadhesins: an alternative to human antibodies. IBC conference on
Antibody Engineering. San Diego, CA, December 1993.

5. Tumor necrosis factor receptor: a potential therapeutic for human septic shock.
American Society for Microbiology Meeting, Atlanta, GA, May 1993.

6. Protective efficacy of TNF receptor immunoadhesin vs anti-TNF monoclonal
antibody in a rat model for endotoxic shock. 5th International Congress on TNF.
Asilomar, CA, May 1994.

7. Interferon-γ signals via a multisubunit receptor complex that contains two types of
polypeptide chain. American Association of Immunologists Conference. San
Francisco, CA, July 1995.

8. Immunoadhesins: Principles and Applications. Gordon Research Conference on
Drug Delivery in Biology and Medicine. Ventura, CA, February 1996.
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9 . Apo-2 Ligand, a new member of the TNF family that induces apoptosis in tumor 
cells. Cambridge Symposium on TNF and Related Cytokines in Treatment of 
Cancer. Hilton-Head, NC, March 1996. 

10. Induction of apoptosis by Apo2 Ligand. American Society for Biochemistry and
Molecular Biology, Symposium on Growth Factors and Cytokine Receptors. New
Orleans, LA, June, 1996.

11. Apo2 ligand, an extracellular trigger of apoptosis. 2nd Clontech Symposium,
Palo Alto, CA, October 1996.

12. Regulation of apoptosis by members of the TNF ligand and receptor families.
Stanford University School of Medicine, Palo Alto, CA, December 1996.

13. Apo-3: a novel receptor that regulates cell death and inflammation. 4th
International Congress on Immune Consequences of Trauma, Shock, and Sepsis.
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14. New members of the TNF ligand and receptor families that regulate apoptosis, 
inflammation, and immunity. UCLA School of Medicine, LA, C A, March 1997. 
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on Bispecific Antibodies. Volendam, Holland, June 1997. 

16. Control of Apo2L signaling. Cold Spring Harbor Laboratory Symposium on
Programmed Cell Death. Cold Spring Harbor, New York. September, 1997. 
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Conference on Apoptosis. San Diego, CA., October 1997. 

18. Control of Apo2L signaling by death and decoy receptors. American Association
for the Advancement of Science. Philadelphia, PA, February 1998.

19. Apo2 ligand and its receptors. American Society of Immunologists. San
Francisco, CA, April 1998.

20. Death receptors and ligands. 7th International TNF Congress. Cape Cod, MA,
May 1998.

21. Apo2L as a potential therapeutic for cancer. UCLA School of Medicine. LA,
CA, June 1998.

22. Apo2L as a potential therapeutic for cancer. Gordon Research Conference on
Cancer Chemotherapy. New London, NH, July 1998.

23. Control of apoptosis by Apo2L. Endocrine Society Conference, Stevenson, WA,
August 1998.

24. Control of apoptosis by Apo2L. International Cytokine Society Conference,
Jerusalem, Israel, October 1998.
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25. Apoptosis control by death and decoy receptors. American Association for 
Cancer Research Conference, Whistler, BC, Canada, March 1999.

26. Apoptosis control by death and decoy receptors. American Society for
Biochemistry and Molecular Biology Conference, San Francisco, CA, May 1999.

27. Apoptosis control by death and decoy receptors. Gordon Research Conference on
Apoptosis, New London, NH, June 1999.

28. Apoptosis control by death and decoy receptors. Arthritis Foundation Research
Conference, Alexandria GA, Aug 1999.

29. Safety and anti-tumor activity of recombinant soluble Apo2L/TRAIL. Cold
Spring Harbor Laboratory Symposium on Programmed Cell Death. Cold Spring
Harbor, NY, September 1999.

30. The Apo2L/TRAIL system: therapeutic potential. American Association for
Cancer Research, Lake Tahoe, NV, Feb 2000.

31. Apoptosis and cancer therapy. Stanford University School of Medicine, Stanford,
CA, Mar 2000.

32. Apoptosis and cancer therapy. University of Pennsylvania School of Medicine,
Philadelphia, PA, Apr 2000.

33. Apoptosis signaling by Apo2L/TRAIL. International Congress on TNF.
Trondheim, Norway, May 2000.

34. The Apo2L/TRAIL system: therapeutic potential. Cap-CURE summit meeting.
Santa Monica, CA, June 2000.

35. The Apo2L/TRAIL system: therapeutic potential. MD Anderson Cancer Center.
Houston, TX, June 2000.

36. Apoptosis signaling by Apo2L/TRAIL. The Protein Society, 14th Symposium.
San Diego, CA, August 2000.

37. Anti-tumor activity of Apo2L/TRAIL. AAPS annual meeting. Indianapolis, IN,
Aug 2000.

38. Apoptosis signaling and anti-cancer potential of Apo2L/TRAIL. Cancer Research
Institute, UC San Francisco, CA, September 2000.

39. Apoptosis signaling by Apo2L/TRAIL. Keynote address, TNF family
Minisymposium, NIH, Bethesda, MD, September 2000.

40. Death receptors: signaling and modulation. Keystone symposium on the
Molecular basis of cancer. Taos, NM, Jan 2001.

41. Preclinical studies of Apo2L/TRAIL in cancer. Symposium on Targeted therapies
in the treatment of lung cancer. Aspen, CO, Jan 2001.
42. Apoptosis signaling by Apo2L/TRAIL. Weizmann Institute of Science, Rehovot,
Israel, March 2001.

43. Apo2L/TRAIL: Apoptosis signaling and potential for cancer therapy. Weizmann
Institute of Science, Rehovot, Israel, March 2001.

44. Targeting death receptors in cancer with Apo2L/TRAIL. Cell Death and Disease
conference, North Falmouth, MA, Jun 2001.

45. Targeting death receptors in cancer with Apo2L/TRAIL. Biotechnology
Organization conference, San Diego, CA, Jun 2001.

46. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Gordon Research
Conference on Apoptosis, Oxford, UK, July 2001.

47. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Cleveland Clinic
Foundation, Cleveland, OH, Oct 2001.

48. Apoptosis signaling by death receptors: overview. International Society for
Interferon and Cytokine Research conference, Cleveland, OH, Oct 2001.

49. Apoptosis signaling by death receptors. American Society of Nephrology
Conference. San Francisco, CA, Oct 2001.

50. Targeting death receptors in cancer. Apoptosis: commercial opportunities. San
Diego, CA, Apr 2002.

51. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Kimmel Cancer
Research Center, Johns Hopkins University, Baltimore MD. May 2002.

52. Apoptosis control by Apo2L/TRAIL. (Keynote Address) University of Alabama
Cancer Center Retreat, Birmingham, AL. October 2002.

53. Apoptosis signaling by Apo2L/TRAIL. (Session co-chair) TNF international
conference. San Diego, CA. October 2002.

54. Apoptosis signaling by Apo2L/TRAIL. Swiss Institute for Cancer Research
(ISREC). Lausanne, Switzerland. Jan 2003.

55. Apoptosis induction with Apo2L/TRAIL. Conference on New Targets and
Innovative Strategies in Cancer Treatment. Monte Carlo. February 2003.

56. Apoptosis signaling by Apo2L/TRAIL. Hermelin Brain Tumor Center
Symposium on Apoptosis. Detroit, MI. April 2003.

57. Targeting apoptosis through death receptors. Sixth Annual Conference on
Targeted Therapies in the Treatment of Breast Cancer. Kona, Hawaii. July 2003.

58. Targeting apoptosis through death receptors. Second International Conference on
Targeted Cancer Therapy. Washington, DC. Aug 2003.
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reagents. US patent 6,582,928B1 (Jun 24, 2003). 






DECLARATION OF PAUL POLAKIS, Ph.D. 
I, Paul Polakis, Ph.D., declare and say as follows: 

1. I was awarded a Ph.D. by the Department of Biochemistry of the Michigan
State University in 1984. My scientific Curriculum Vitae is attached to and forms
part of this Declaration (Exhibit A).

2. I am currently employed by Genentech, Inc. where my job title is Staff
Scientist. Since joining Genentech in 1999, one of my primary responsibilities has
been leading Genentech's Tumor Antigen Project, which is a large research project
with a primary focus on identifying tumor cell markers that find use as targets for
both the diagnosis and treatment of cancer in humans.

3. As part of the Tumor Antigen Project, my laboratory has been analyzing 
differential expression of various genes in tumor cells relative to normal cells.
The purpose of this research is to identify proteins that are abundantly expressed
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at
lower levels, on corresponding normal cells. We call such differentially expressed
proteins "tumor antigen proteins". When such a tumor antigen protein is
identified, one can produce an antibody that recognizes and binds to that protein. 
Such an antibody finds use in the diagnosis of human cancer and may ultimately 
serve as an effective therapeutic in the treatment of human cancer. 

4. In the course of the research conducted by Genentech's Tumor Antigen
Project, we have employed a variety of scientific techniques for detecting and 
studying differential gene expression in human tumor cells relative to normal cells, 
at genomic DNA, mRNA and protein levels. An important example of one such 
technique is the well known and widely used technique of microarray analysis 
which has proven to be extremely useful for the identification of mRNA molecules 
that are differentially expressed in one tissue or cell type relative to another. In the 
course of our research using microarray analysis, we have identified 
approximately 200 gene transcripts that are present in human tumor cells at 
significantly higher levels than in corresponding normal human cells. To date, we
have generated antibodies that bind to about 30 of the tumor antigen proteins 
expressed from these differentially expressed gene transcripts and have used these 
antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. We 
have then compared the levels of mRNA and protein in both the tumor and normal 
cells analyzed. 

5. From the mRNA and protein expression analyses described in paragraph 4 
above, we have observed that there is a strong correlation between changes in the 
level of mRNA present in any particular cell type and the level of protein 



expressed from that mRNA in that cell type. In approximately 80% of our 
observations we have found that increases in the level of a particular mRNA 
correlates with changes in the level of protein expressed from that mRNA when 
human tumor cells are compared with their corresponding normal cells. 

6. Based upon my own experience accumulated in more than 20 years of 
research, including the data discussed in paragraphs 4 and 5 above and my 
knowledge of the relevant scientific literature, it is my considered scientific 
opinion that for human genes, an increased level of mRNA in a tumor cell relative 
to a normal cell typically correlates to a similar increase in abundance of the 
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a 
central dogma in molecular biology that increased mRNA levels are predictive of 
corresponding increased levels of the encoded protein. While there have been 
published reports of genes for which such a correlation does not exist, it is my 
opinion that such reports are exceptions to the commonly understood general rule 
that increased mRNA levels are predictive of corresponding increased levels of the 
encoded protein. 


7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be true, 
and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States Code and that such willful 
statements may jeopardize the validity of the application or any patent issued
thereon. 
Gain and loss of chromosomal material is characteristic
of bladder cancer, as well as malignant transformation in
general. The consequences of these changes at both the
transcription and translation levels is at present unknown
partly because of technical limitations. Here we have at-
tempted to address this question in pairs of non-invasive
and invasive human bladder tumors using a combination
of technology that included comparative genomic hybridi-
zation, high density oligonucleotide array-based monitor-
ing [illegible]. The results showed
that there is a gene dosage effect that in some cases
superimposes on other regulatory mechanisms. This ef-
fect depended (p < 0.015) on the magnitude of the com-
parative genomic hybridization change [illegible]
areas with more than 2-fold gain
of DNA showed a corresponding increase in mRNA tran-
scripts. Areas with loss of DNA, on the other hand,
showed either reduced or unaltered transcript levels.
[illegible]
are unknown it was only possible to compare mRNA and
protein alterations in relatively few cases of well focused
abundant proteins. With few exceptions we found a good
correlation [illegible] between transcript alterations and
protein levels. The implications, as well as limitations,
of the approach are discussed. Molecular & Cellular
Proteomics 1:37-45, 2002.

Aneuploidy is a common feature of most human cancers
(1), but little is known about the genome-wide effect of this
phenomenon at both the transcription and translation levels.
High throughput array studies of the breast cancer cell line
BT474 has suggested that there is a correlation between
copy numbers and gene expression in highly amplified
areas (2), and studies of individual genes in solid tumors
have revealed a good correlation between gene dose and
mRNA or protein levels in the case of c-erb-B2, cyclin D1,
ems1, and N-myc (3-5). However, a high cyclin D1 protein
expression has been observed without simultaneous am-
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crease was observed without concomitant c-myc protein 
overexpression (6).

In human bladder tumors, karyotyping, fluorescent in situ
hybridization, and comparative genomic hybridization (CGH)
have revealed chromosomal aberrations that seem to be
characteristic of certain stages of disease progression. In the
case of non-invasive pTa transitional cell carcinomas (TCCs),
this includes loss of chromosome 9 or parts of it, as well as
loss of Y in males. In minimally invasive pT1 TCCs, the fol-
lowing alterations have been reported: 2q-, 11p-, 1q+,
11q13+, 17q+, and 20q+ (7-12). It has been suggested that
these regions harbor tumor suppressor genes and onco-
genes; however, the large chromosomal areas involved often
contain many genes, making meaningful predictions of the
functional consequences of losses and gains very difficult.

In this investigation we have combined genome-wide tech-
nology for detecting genomic gains and losses (CGH) with
gene expression profiling techniques (microarrays and pro-
teomics) to determine the effect of gene copy number on
transcript and protein levels in pairs of non-invasive and in-
vasive human bladder TCCs.

EXPERIMENTAL PROCEDURES
Material - Bladder tumor biopsies were sampled after informed
consent was obtained and after removal of tissue for routine [illegible]

staged by an experienced pathologist as pTa (superficial papillary),
Fig. 1. DNA copy number and mRNA expression level. [Figure not reproduced; caption partly illegible.] Invasive tumor 733 is
compared with the non-invasive counterpart tumor 335, and invasive tumor 827 with the non-invasive counterpart tumor 532.
The average fluorescent signal ratio between tumor DNA and normal DNA is shown along the length of the chromosome.
The bold curve in the ratio profile represents a mean, with curves indicating one standard deviation. The central vertical
line (broken) [illegible], and the lines next to it (dotted) indicate a ratio of 0.5 (left) and 2.0 (right). The bars represent
one gene each, identified by the running numbers above the bars. The bars indicate the purported location of the gene, and
the colors indicate the expression level of the gene in the invasive tumor compared with the non-invasive counterpart. The
bar to the far right, entitled Expression, shows the resulting change [caption truncated].
grade I and II, respectively. Tumors 733 and 827 were staged as pT1
(invasive into submucosa); 733 was staged as solid, and 827 was [illegible]

[illegible] were embedded immediately in a sodium-guanidinium thiocyanate
solution and stored at -80 °C. Total RNA was [illegible]
RNAzol B RNA isolation method (WAK-Chemie Medical GmbH).
[illegible] RNA was isolated by an oligo(dT) selection step (Oligotex [illegible]

[illegible] 1 µg of mRNA was used [illegible]
The first and second strand cDNA synthesis [illegible]
Superscript® choice system (Invitrogen) according to the manufac-
turer's instructions but using an oligo(dT) primer containing a T7 RNA
polymerase binding site. Labeled cRNA was prepared [illegible]
MEGAscript® in vitro transcription kit (Ambion). Biotin-labeled CTP and
UTP (Enzo) was used, together with unlabeled NTPs in the reaction.
Following the in vitro transcription reaction, the unincorporated nu-
cleotides were removed using RNeasy columns (Qiagen).

Array Hybridization and Scanning - Array hybridization and scan-
ning was modified from a previous method (13). 10 µg of cRNA was
fragmented at 94 °C for 35 min in buffer containing 40 mM Tris
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization,
[illegible] in a 6× SSPE-T hybridization buffer (1 M NaCl,
10 mM Tris, pH 7.6, 0.005% Triton), was heated to 95 °C for 5 min,
[illegible] cooled to 40 °C, and loaded onto [illegible]
array cartridge. The probe array was then incubated for 16 h at 40 °C
at constant rotation (60 rpm). The [illegible]
washes in 6× SSPE-T at 25 °C followed by 4 [illegible]
at 50 °C. The biotinylated cRNA was stained with [illegible]
phycoerythrin conjugate, 10 µg/ml (Molecular Probes), in 6× SSPE-T
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Fig. 1— continued 



for 30 min at 25 °C followed by 10 washes in 6× SSPE-T at 25 °C. The
probe arrays were scanned at 560 nm using a confocal laser scanning
microscope (made for Affymetrix by Hewlett-Packard). The readings
from the quantitative scanning were analyzed by Affymetrix gene
expression analysis software.

Microsatellite Analysis—Microsatellite analysis was performed as
described previously (14). Microsatellites were selected by use of
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were ob-
tained from the genome data base at www.gdb.org. DNA was extracted
from tumor and blood and amplified by PCR in a volume of 20 µl for 35
cycles. The amplicons were denatured and electrophoresed for 3 h in an
ABI Prism 377. Data were collected in the Gene Scan program for
fragment analysis. Loss of heterozygosity was defined as less than 33%
of one allele detected in tumor amplicons compared with blood.
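
As an illustration only, the loss-of-heterozygosity criterion above can be written out as a short calculation. The sketch below is in Python and assumes that each allele is summarized by a single peak height from the electropherogram, and that "less than 33% of one allele" means that the allele's tumor signal, normalized to the other allele, falls below 33% of the same ratio in blood. The function name, the normalization, and the cross-multiplied form are hypothetical choices made for this sketch and are not taken from the study.

```python
def shows_loh(tumor_a, tumor_b, blood_a, blood_b, cutoff=0.33):
    """Return True if either allele appears lost in tumor relative to blood.

    Arguments are peak heights (arbitrary units) for alleles A and B.
    Normalizing one allele to the other is an assumption of this sketch.
    """
    if blood_a <= 0 or blood_b <= 0:
        raise ValueError("blood sample must show both alleles")
    if tumor_a <= 0 and tumor_b <= 0:
        raise ValueError("tumor sample shows no signal")
    # Retention of A is (tumor_a / tumor_b) / (blood_a / blood_b).
    # Cross-multiplying avoids dividing by a tumor peak of zero.
    a_side = tumor_a * blood_b
    b_side = tumor_b * blood_a
    return a_side < cutoff * b_side or b_side < cutoff * a_side
```

Under these assumptions, a tumor in which allele A has dropped to 30% of its matched blood level is scored as LOH, whereas a drop to 40% is not.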

Proteomic Analysis—TCCs were minced into small pieces and
homogenized in a small glass homogenizer in 0.5 ml of lysis solution.
Samples were stored at -20 °C until use. The procedure for 2D gel
electrophoresis has been described in detail elsewhere (15, 16). Gels
were stained with silver nitrate and/or Coomassie Brilliant Blue. Pro-
teins were identified by a combination of procedures that included
microsequencing, mass spectrometry, two-dimensional gel Western
immunoblotting, and comparison with the master two-dimensional gel
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis.

CGH—Hybridization of differentially labeled tumor and normal DNA
to normal metaphase chromosomes was performed as described
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas Red-
labeled reference DNA (200 ng), and human Cot-1 DNA (20 µg) were
denatured at 37 °C for 5 min and applied to denatured normal met-
aphase slides. Hybridization was at 37 °C for 2 days. After washing,
the slides were counterstained with 0.15 µg/ml 4,6-diamidino-2-phe-
nylindole in an anti-fade solution. A second hybridization was per-
formed for all tumor samples using fluorescein-labeled reference DNA
and Texas Red-labeled tumor DNA (inverse labeling) to confirm the
aberrations detected during the initial hybridization. Each CGH ex-
periment also included a normal control hybridization using fluores-
cein- and Texas Red-labeled normal DNA. Digital image analysis was
used to identify chromosomal regions with abnormal fluorescence
ratios, indicating regions of DNA gains and losses. The average
green:red fluorescence intensity ratio profiles were calculated using
four images of each chromosome (eight chromosomes total) with
normalization of the green:red fluorescence intensity ratio for the
entire metaphase and background correction. Chromosome identifi-
cation was performed based on 4,6-diamidino-2-phenylindole band-
ing patterns. Only images showing uniform high intensity fluores-
cence with minimal background staining were analyzed. All
centromeres, p arms of acrocentric chromosomes, and heterochro-
matic regions were excluded from the analysis.

RESULTS 

Comparative Genomic Hybridization— The CGH analysis 
identified a number of chromosomal gains and losses in the 
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Table I 

Correlation between alterations detected by CGH and by expression monitoring
Top, CGH used as independent variable (if CGH alteration - what expression ratio was found); bottom, altered expression used as
independent variable (if expression alteration - what CGH deviation was found).

Top: CGH alterations as independent variable

Tumor pair     CGH alterations   Expression change clusters                          Concordance
733 vs. 335    13 Gain           10 Up-regulation, 0 Down-regulation, 3 No change    77%
733 vs. 335    10 Loss           1 Up-regulation, 5 Down-regulation, 4 No change     50%
827 vs. 532    10 Gain           8 Up-regulation, 0 Down-regulation, 2 No change     80%
827 vs. 532    12 Loss           3 Up-regulation, 2 Down-regulation, 7 No change     17%

Bottom: expression change clusters as independent variable

Tumor pair     Expression change clusters   CGH alterations                  Concordance
733 vs. 335    16 Up-regulation             11 Gain, 2 Loss, 3 No change     69%
733 vs. 335    21 Down-regulation           1 Gain, 8 Loss, 12 No change     38%
733 vs. 335    15 No change                 3 Gain, 3 Loss, 9 No change      60%
827 vs. 532    17 Up-regulation             10 Gain, 5 Loss, 2 No change     59%
827 vs. 532    9 Down-regulation            0 Gain, 3 Loss, 6 No change      33%
827 vs. 532    21 No change                 1 Gain, 3 Loss, 17 No change     81%


two invasive tumors (stage pT1, TCCs 733 and 827), whereas
the two non-invasive papillomas (stage pTa, TCCs 335 and
532) showed only 9p-, 9q22-q33-, and X-, and 7+, 9q-,
and Y-, respectively. Both invasive tumors showed changes
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-,
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are typ-
ical for their disease stage, as well as additional alterations,
some of which are shown in Fig. 1. Areas with gains and
losses deviated from the normal copy number to some extent,
and the average numerical deviation from normal was 0.4-fold
in the case of TCC 733 and 0.3-fold for TCC 827. The largest
changes, amounting to at least a doubling of chromosomal
content, were observed at 1q23 in TCC 733 (Fig. 1A) and
20q12 in TCC 827 (Fig. 1B).

mRNA Expression in Relation to DNA Copy Number - The
mRNA levels from the two invasive tumors (TCCs 827 and
733) were compared with the two non-invasive counterparts
(TCCs 532 and 335). This was done in two separate experi-
ments in which we compared TCCs 733 to 335 and 827 to
532, respectively, using two different scaling settings for the
arrays to rule out scaling as a confounding parameter. Ap-
proximately 1,800 genes that yielded a signal on the arrays
were searched in the Unigene and Genemap data bases for
chromosomal location, and those with a known location
(1096) were plotted as bars covering their purported locus. In
that way it was possible to construct a graphic presentation of
DNA copy number and relative mRNA levels along the indi-
vidual chromosomes (Fig. 1).

For each mRNA a ratio was calculated between the level in
the invasive versus the non-invasive counterpart. Bars, which
represent chromosomal location of a gene, were color-coded
according to the expression ratio, and only differences larger
than 2-fold were regarded as informative (Fig. 1). The density
of genes along the chromosomes varied, and areas contain-
ing only one gene were excluded from the calculations. The
resolution of the CGH method is very low, and some of the
outlier data may be because of the fact that the boundaries of
the chromosomal aberrations are not known at high resolution.
Two sets of calculations were made from the data. For the
first set we used CGH alterations as the independent variable
and estimated the frequency of expression alterations in these
chromosomal areas. In general, areas with a strong gain of
chromosomal material contained a cluster of genes having
increased mRNA expression. For example, both chromo-
somes 1q21-q25, 2p, and 9q showed a relative gain of more
than 100% in DNA copy number that was accompanied by
increased mRNA expression levels in the two tumor pairs (Fig.
1). In most cases, chromosomal gains detected by CGH were
accompanied by an increased level of transcripts in both
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal
losses, on the other hand, were not accompanied by de-
creased expression in several cases, and were often regis-
tered as having unaltered RNA levels (Table I, top). The inabil-
ity to detect RNA expression changes in these cases was not
because of fewer genes mapping to the lost regions (data not
shown).
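
The concordance figures quoted above follow directly from the counts in Table I (top). As a minimal sketch in Python, concordance is taken here to be the share of CGH-altered areas whose expression cluster changed in the same direction, rounded to a whole percent. The function and variable names are illustrative only and do not come from the paper.

```python
def concordance(same_direction, total):
    """Percentage of altered areas whose expression changed the same way."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * same_direction / total)


# Counts from Table I (top): (concordant areas, all areas with that CGH change)
table_top = {
    ("733 vs. 335", "gain"): (10, 13),
    ("733 vs. 335", "loss"): (5, 10),
    ("827 vs. 532", "gain"): (8, 10),
    ("827 vs. 532", "loss"): (2, 12),
}

percentages = {key: concordance(*counts) for key, counts in table_top.items()}
```

With these counts the sketch reproduces the published values of 77% and 50% for TCC 733 and 80% and 17% for TCC 827.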

In the second set of calculations we selected expression
alterations above 2-fold as the independent variable and es-
timated the frequency of CGH alterations in these areas. As
above, we found that increased transcript expression corre-
lated with gain of chromosomal material (TCC 733, 69% and
TCC 827, 59%), whereas reduced expression was often de-
tected in areas with unaltered CGH ratios (Table I, bottom).
Furthermore, as a control we looked at areas with no alteration

Fig. 2. Correlation between maximum CGH aberration and the ability to detect expression change by oligonucleotide array
monitoring. The aberration is shown as a numerical -fold change in ratio between invasive tumors 827 (A) and 733 (♦) and their non-invasive
counterparts 532 and 335. The expression change was taken from the Expression line to the right in Fig. 1, which depicts the resulting
expression change for a given chromosomal region. At least half of the mRNAs from a given region have to be either up- or down-regulated
to be scored as an expression change. All chromosomal arms in which the CGH ratio plus or minus one standard deviation was outside the
ratio value of one were included.

in expression. No alteration was detected by CGH in
most of these areas (TCC 733, 60% and TCC 827, 81%; see
Table I, bottom). Because the ability to observe reduced or
increased mRNA expression clustering to a certain chromo-
somal area clearly reflected the extent of copy number
changes, we plotted the maximum CGH aberrations in the
regions showing CGH changes against the ability to detect a
change in mRNA expression as monitored by the oligonucleo-
tide arrays (Fig. 2). For both tumors TCC 733 (p < 0.015) and
TCC 827 (p < 0.00003) a highly significant correlation was
observed between the level of CGH ratio change (reflecting
the DNA copy number) and alterations detected by the array
based technology (Fig. 2). Similar data were obtained when
areas with altered expression were used as independent vari-
ables. These areas correlated best with CGH when the CGH
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did
not at lower CGH deviations. These data probably reflect that
loss of an allele may only lead to a 50% reduction in expres-
sion level, which is at the cut-off point for detection of expres-
sion alterations. Gain of chromosomal material can occur to a
much larger extent.
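
The point about the detection cut-off can be made concrete with a simple proportional model. The sketch below, in Python, assumes that transcript level scales linearly with gene copy number from a diploid baseline of two copies. That is an idealization introduced here for illustration and is not a claim made by the authors. It applies the criterion stated in the text that only differences larger than 2-fold are informative. All names are hypothetical.

```python
def expected_ratio(tumor_copies, normal_copies=2):
    """Tumor/normal expression ratio under a purely linear gene dosage model."""
    if normal_copies <= 0 or tumor_copies < 0:
        raise ValueError("copy numbers must be non-negative, baseline positive")
    return tumor_copies / normal_copies


def is_informative(ratio, fold=2.0):
    """True only if the change is larger than `fold` in either direction."""
    return ratio > fold or ratio < 1.0 / fold
```

Under this model, loss of one of two alleles gives a ratio of exactly 0.5, which sits on the cut-off and is therefore not scored as a change, whereas a gain to five copies gives 2.5 and is scored.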

Microsatellite-based Detection of Minor Areas of Loss-
es—In TCC 733, several chromosomal areas exhibiting DNA
amplification were preceded or followed by areas with a nor-
mal CGH but reduced mRNA expression (see Fig. 1, TCC 733
chromosome 1q32, 2p21, and 7q21 and q32, 9q34, and
10q22). To determine whether these results were because of
undetected loss of chromosomal material in these regions or
because of other non-structural mechanisms regulating tran-
scription, we examined two microsatellites positioned at chro-
mosome 1q25-32 and two at chromosome 2p22. Loss of
heterozygosity (LOH) was found at both 1q25 and at 2p22
indicating that minor deleted areas were not detected with the
resolution of CGH (Fig. 3). Additionally, chromosome 2p in
TCC 733 showed a CGH pattern of gain/no change/gain of
DNA that correlated with transcript increase/decrease/in-
crease. Thus, for the areas showing increased expression
there was a correlation with the DNA copy number alterations
(Fig. 1A). As indicated above, the mRNA decrease observed in
the middle of the chromosomal gain was because of LOH,
implying that one of the mechanisms for mRNA down-regu-
lation may be regions that have undergone smaller losses of
chromosomal material. However, this cannot be detected with
the resolution of the CGH method.

In both TCC 733 and TCC 827, the telomeric end of chro-
mosome 11p showed a normal ratio in the CGH analysis;
however, clusters of five and three genes, respectively, lost
their expression. Two microsatellites (D11S1760, D11S922)
positioned close to MUC2, IGF2, and cathepsin D indicated
LOH as the most likely mechanism behind the loss of expres-
sion (data not shown).

A reduced expression of mRNA observed in TCC 733 at 
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24 
and in TCC 827 at chromosome 11p15.5, 12p11, 15q11.2,
and 18q12 was also examined for chromosomal losses using 
microsatellites positioned as close as possible to the gene loci 
Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor
733 showing loss of heterozygosity at chromosome 1q25, detected
(a) by D1S215 close to Hu class I histocompatibility antigen (gene
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close
to general β-spectrin (gene number 11 in Fig. 1), and (d) tumor 827
showing loss of heterozygosity at chromosome 18q12 by D18S1118
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number
12 in Fig. 1). The upper curves show the electropherogram obtained
from normal DNA from leukocytes (N), and the lower curves show the
electropherogram from tumor DNA (T). In all cases one allele is
partially lost in the tumor amplicon.

showing reduced mRNA transcripts. Only the microsatellite 
positioned at 18q12 showed LOH (Fig. 3), suggesting that
transcriptional down-regulation of genes in the other regions
may be controlled by other mechanisms.

Relation between Changes in mRNA and Protein Levels - 
2D-PAGE analysis, in combination with Coomassie Brilliant 
Blue and/or silver staining, was carried out on all four tumors 
using fresh biopsy material. 40 well resolved abundant known 
proteins migrating in areas away from the edges of the pH 
Fig. 4. Correlation between protein levels as judged by 2D-PAGE
and transcript ratio. For comparison proteins were divided in
three groups, unaltered in level or up- or down-regulated (horizontal
axis). The mRNA ratio as determined by oligonucleotide arrays was
plotted for each gene (vertical axis). ▲, mRNAs that were scored as
present in both tumors used for the ratio calculation; △, mRNAs that
were scored as absent in the invasive tumors (along horizontal axis) or
as absent in non-invasive reference (top of figure). Two different
scalings were used to exclude scaling as a confounder. TCCs 827
and 532 (▲△) were scaled with background suppression, and TCCs
733 and 335 (●○) were scaled without suppression. Both comparisons
showed highly significant (p < 0.005) differences in mRNA ratios
between the groups. Proteins shown were as follows: Group A (from
left), phosphoglucomutase 1, glutathione transferase class μ number
4, fatty acid-binding protein homologue, cytokeratin 15, and
cytokeratin 13; B (from left), fatty acid-binding protein homologue, 28-kDa
heat shock protein, cytokeratin 13, and calcyclin; C (from left), α-enolase,
hnRNP B1, 28-kDa heat shock protein, 14-3-3-ε, and
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from
top), glutathione S-transferase-π and mesothelial keratin K7 (type II);
F (from top and left), adenylyl cyclase-associated protein, E-cadherin,
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV,
cytoskeletal γ-actin, hnRNP A1, integral membrane protein calnexin
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B,
translationally controlled tumor protein, liver glyceraldehyde-3-phosphate
dehydrogenase, keratin 8, aldehyde reductase, and Na,K-ATPase
β-1 subunit; G (from top and left), TCP20, calgizzarin, 70-kDa
heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver
glyceraldehyde-3-phosphate dehydrogenase, glutathione S-transferase-π,
and keratin 8; H (from left), plasma gelsolin, autoantigen
calreticulin, thioredoxin, and NAD+-dependent 15 hydroxyprostaglandin
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit,
cytokeratin 20, cytokeratin 17, prohibitin, and fructose 1,6-biphosphatase;
J, annexin II; K, annexin IV; L (from top and left), 90-kDa heat
shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, GRP 78,
cyclophilin, and cofilin.

gradient, and having a known chromosomal location, were 
selected for analysis in the TCC pair 827/532. Proteins were 
identified by a combination of methods (see "Experimental 
Procedures"). In general there was a highly significant correlation
(p < 0.005) between mRNA and protein alterations (Fig. 4).
Only one gene showed disagreement between transcript
alteration and protein alteration. Except for a group of cyto- 
Fig. 5. Comparison of protein and transcript levels in invasive
and non-invasive TCCs. The upper part of the figure shows a 2D gel
(left) and the oligonucleotide array (right) of TCC 532. The red
rectangles on the upper gel highlight the areas that are compared below.
Identical areas of 2D gels of TCCs 532 and 827 are shown below.
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC 
827 (red annotation). The tile on the array containing probes for 
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532 
and is compared with TCC 827. The upper row of squares in each tile 
corresponds to perfect match probes; the lower row corresponds to 
mismatch probes containing a mutation (used for correction for un- 
specific binding). Absence of signal is depicted as black, and the 
higher the signal the lighter the color. A high transcript level was 
detected in TCC 532 (6151 units) whereas a much lower level was 
detected in TCC 827 (absence of signals). For cytokeratin 13, a high 
transcript level was also present in TCC 532 (15659 units), and a 
much lower level was present in TCC 827 (623 units). The 2D gels at 
the bottom of the figure (left) show levels of PA-FABP and adipocyte-FABP
in TCCs 335 and 733 (invasive), respectively. Both proteins are
down-regulated in the invasive tumor. To the right we show the array
tiles for the PA-FABP transcript. A medium transcript level was
detected in the case of TCC 335 (1277 units) whereas very low levels
were detected in TCC 733 (166 units). IEF, isoelectric focusing.



keratins encoded by genes on chromosome 17 (Fig. 5) the 
analyzed proteins did not belong to a particular family. 26 well 
focused proteins whose genes had a known chromosomal
location were detected in TCCs 733 and 335, and of these 19
correlated (p < 0.005) with the mRNA changes detected using
the arrays (Fig. 4). For example, PA-FABP was highly expressed
in the non-invasive TCC 335 but lost in the invasive
counterpart (TCC 733; see Fig. 5). The smaller number of
proteins detected in both 733 and 335 was because of the
smaller size of the biopsies that were available. 

11 chromosomal regions where CGH showed aberrations 
that corresponded to the changes in transcript levels also 
showed corresponding changes in the protein level (Table II). 
These regions included genes that encode proteins that are
found to be frequently altered in bladder cancer, namely 
cytokeratins 17 and 20, annexins II and IV, and the fatty 
acid-binding proteins PA-FABP and FBP1. Four of these proteins
were encoded by genes in chromosome 17q, a frequently
amplified chromosomal area in invasive bladder
cancers.

DISCUSSION 

Most human cancers have abnormal DNA content, having 
lost some chromosomal parts and gained others. The present 
study provides some evidence as to the effect of these gains 
and losses on gene expression in two pairs of non-invasive 
and invasive TCCs using high throughput expression arrays 
and proteomics, in combination with CGH. In general, the
results showed that there is a clear individual regulation of the
mRNA expression of single genes, which in some cases was
superimposed by a DNA copy number effect. In most cases,
genes located in chromosomal areas with gains often exhibited
increased mRNA expression, whereas areas showing
losses showed either no change or a reduced mRNA expression.
The latter might be because of the fact that losses most
often are restricted to loss of one allele, and the cut-off point 
for detection of expression alterations was a 2-fold change, 
thus being at the border of detection. In several cases, how- 



Table II

Proteins whose expression level correlates with both mRNA and gene dose changes

Protein                 Chromosomal  Tumor     CGH         Transcript            Protein
                        location     TCC       alteration  alteration(a)         alteration

Annexin II              1q21         733       Gain        Abs to Pres           Increase
Annexin IV              2p13         733       Gain        3.9-Fold up           Increase
Cytokeratin 17          17q12-q21    827       Gain        3.8-Fold up           Increase
Cytokeratin 20          17q21.1      827       Gain        5.6-Fold up           Increase
(PA-)FABP               8q21.2       827       Loss        10-Fold down          Decrease
FBP1                    9q22         827       Gain        2.3-Fold up           Increase
Plasma gelsolin         9q31         827       Gain        Abs to Pres           Increase
Heat shock protein 28   15q12-q13    827       Loss        2.5-Fold up           Decrease
Prohibitin              17q21        827/733   Gain        3.7-/2.5-Fold up(b)   Increase
Prolyl-4-hydroxylase    17q25        827/733   Gain        5.7-/1.6-Fold up      Increase
hnRNP B1                7p15         827       Loss        2.5-Fold down         Decrease

(a) Abs, absent; Pres, present.

(b) In cases where the corresponding alterations were found in both TCCs 827 and 733 these are shown as 827/733.




ever, an increase or decrease in DNA copy number was 
associated with de novo occurrence or complete loss of tran- 
script, respectively. Some of these transcripts could not be 
detected in the non-invasive tumor but were present at relatively
high levels in areas with DNA amplifications in the invasive
tumors (e.g. in TCC 733 transcript from cellular ligand of
annexin II gene (chromosome 1q21) from absent to 2670
arbitrary units; in TCC 827 transcript from small proline-rich
protein 1 gene (chromosome 1q12-q21.1) from absent to
1326 arbitrary units). It may be anticipated from these data
that significant clustering of genes with an increased expression
to a certain chromosomal area indicates an increased
likelihood of gain of chromosomal material in this area.

Considering the many possible regulatory mechanisms act- 
ing at the level of transcription, it seems striking that the gene 
dose effects were so clearly detectable in gained areas. One 
hypothetical explanation may lie in the loss of controlled 
methylation in tumor cells (17-19). Thus, it may be possible
that in chromosomes with increased DNA copy numbers two
or more alleles could be demethylated simultaneously leading
to a higher transcription level, whereas in chromosomes with
losses the remaining allele could be partly methylated, turning
off the process (20, 21). A recent report has documented a
ploidy regulation of gene expression in yeast, but in this case all 
the genes were present in the same ratio (22), a situation that is 
not analogous to that of cancer cells, which show marked 
chromosomal aberrations, as well as gene dosage effects. 

Several CGH studies of bladder cancer have shown that 
some chromosomal aberrations are common at certain 
stages of disease progression, often occurring in more than 1
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+, Y-
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 1q+, 5p+, 8q+,
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here
showed similar aberrations such as 9p- and 9q22-q33- and
9q- and Y-, respectively. Likewise, the two minimal invasive
pT1 tumors showed aberrations that are commonly seen at
that stage, and TCC 827 had a remarkable resemblance to the
commonly seen pattern of losses and gains, such as 1q22-24
amplification (seen in both tumors), 11q14-q22 loss, the latter
often linked to 17q+ (both tumors), and 1q+ and 9p-, often
linked to 20q+ and 11q13+ (both tumors) (7-9). These observations
indicate that the pairs of tumors used in this study
exhibit chromosomal changes observed in many tumors, and
therefore the findings could be of general importance for
bladder cancer.

Considering that the mapping resolution of CGH is of about 
20 megabases it is only possible to get a crude picture of 
chromosomal instability using this technique. Occasionally, 
we observed reduced transcript levels close to or inside re- 
gions with increased copy numbers. Analysis of these regions 
by positioning heterozygous microsatellites as close as pos- 
sible to the locus showing reduced gene expression revealed 
loss of heterozygosity in several cases. It seems likely that 
multiple and different events occur along each chromosomal
arm and that the use of cDNA microarrays for analysis of DNA
copy number changes will reach a resolution that can resolve
these changes, as has recently been proposed (2). The outlier
data were not more frequent at the boundaries of the CGH
aberrations. At present we do not know the mechanism behind
chromosomal aneuploidy and cannot predict whether
chromosomal gains will be transcribed to a larger extent than
the two native alleles. A mechanism such as genetic imprinting has
an impact on the expression level in normal cells and is often
reduced in tumors. However, the relation between imprinting
and gain of chromosomal material is not known.

We regard it as a strength of this investigation that we were 
able to compare invasive tumors to benign tumors rather than 
to normal urothelium, as the tumors studied were biologically 
very close and probably may represent successive steps in 
the progression of bladder cancer. Despite the limited amount 
of fresh tissue available it was possible to apply three different 
state of the art methods. The observed correlation between 
DNA copy number and mRNA expression is remarkable when 
one considers that different pieces of the tumor biopsies were 
used for the different sets of experiments. This indicates that
bladder tumors are relatively homogenous, a notion recently 
supported by CGH and LOH data that showed a remarkable 
similarity even between tumors and distant metastasis (10, 23). 

In the few cases analyzed, mRNA and protein levels 
showed a striking correspondence although in some cases 
we found discrepancies that may be attributed to translational
regulation, post-translational processing, protein degrada- 
tion, or a combination of these. Some transcripts belong to 
under-translated mRNA pools, which are associated with few 
translationally inactive ribosomes; these pools, however, 
seem to be rare (24). Protein degradation, for example, may 
be very important in the case of polypeptides with a short 
half-life (e.g. signaling proteins). A poor correlation between 
mRNA and protein levels was found in liver cells as determined
by arrays and 2D-PAGE (25), and a moderate correlation
was recently reported by Ideker et al. (26) in yeast.
Interestingly, our study revealed a much better correlation
between gained chromosomal areas and increased mRNA
levels than between loss of chromosomal areas and reduced
mRNA levels. In general, the level of CGH change determined
the ability to detect a change in transcript. One possible
explanation could be that by losing one allele the change in
mRNA level is not so dramatic as compared with gain of
material, which can be rather unlimited and may lead to a 
severalfold increase in gene copy number resulting in a much 
higher impact on transcript level. The latter would be much 
easier to detect on the expression arrays as the cut-off point 
was placed at a 2-fold level so as not to be biased by noise on 
the array. Construction of arrays with a better signal to noise 
ratio may in the future allow detection of lesser than 2-fold 
alterations in transcript levels, a feature that may facilitate the 
analysis of the effect of loss of chromosomal areas on
transcript levels.
In eleven cases we found a significant correlation between
DNA copy number, mRNA expression, and protein level. Four
of these proteins were encoded by genes located at a fre- 
quently amplified area in chromosome 17q. Whether DNA 
copy number is one of the mechanisms behind alteration of 
these eleven proteins is at present unknown and will have to 
be proved by other methods using a larger number of samples.
One factor making such studies complicated is the large
extent of protein modification that occurs after translation, 
requiring immunoidentification and/or mass spectrometry to 
correctly identify the proteins in the gels. 

In conclusion, the results presented in this study exemplify 
the large body of knowledge that may be possible to gather in 
the future by combining state of the art techniques that follow 
the pathway from DNA to protein (26). Here, we used a tradi- 
tional chromosomal CGH method, but in the future high reso- 
lution CGH based on microarrays with many thousand radiation 
hybrid-mapped genes will increase the resolution and information
derived from these types of experiments (2). Combined with
expression arrays analyzing transcripts derived from genes with
known locations, and 2D gel analysis to obtain information at
the post-translational level, a clearer and more developed un- 
derstanding of the tumor genome will be forthcoming. 
ABSTRACT

Genetic changes underlie tumor progression and may lead to cancer-specific
expression of critical genes. Over 1500 publications have described
the use of comparative genomic hybridization (CGH) to analyze
the pattern of copy number alterations in cancer, but very few of the genes
affected are known. Here, we performed high-resolution CGH analysis on
cDNA microarrays in breast cancer and directly compared copy number 
and mRNA expression levels of 13,824 genes to quantitate the impact of 
genomic changes on gene expression. We identified and mapped the 
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12 
Mb. Throughout the genome, both high- and low-level copy number 
changes had a substantial impact on gene expression, with 44% of the 
highly amplified genes showing overexpression and 10.5% of the highly
overexpressed genes being amplified. Statistical analysis with random
permutation tests identified 270 genes whose expression levels across 14 
samples were systematically attributable to gene amplification. These 
included most previously described amplified genes in breast cancer and 
many novel targets for genomic alterations, including the HOXB7 gene, 
the presence of which in a novel amplicon at 17q21.3 was validated in
10.2% of primary breast cancers and associated with poor patient prog- 
nosis. In conclusion, CGH on cDNA microarrays revealed hundreds of 
novel genes whose overexpression is attributable to gene amplification. 
These genes may provide insights to the clonal evolution and progression 
of breast cancer and highlight promising therapeutic targets. 

INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have 
facilitated classification of cancers into biologically distinct catego- 
ries, some of which may explain the clinical behavior of the tumors 
(1-6). Despite this progress in diagnostic classification, the molecular 
mechanisms underlying gene expression patterns in cancer have re- 
mained elusive, and the utility of gene expression profiling in the 
identification of specific therapeutic targets remains limited. 

Accumulation of genetic defects is thought to underlie the clonal 
evolution of cancer. Identification of the genes that mediate the effects 
of genetic changes may be important by highlighting transcripts that 
are actively involved in tumor progression. Such transcripts and their 
encoded proteins would be ideal targets for anticancer therapies, as 
demonstrated by the clinical success of new therapies against ampli- 
fied oncogenes, such as ERBB2 and EGFR (7, 8), in breast cancer and 
other solid tumors. Besides amplifications of known oncogenes, over 
Fig. 1. Impact of gene copy number on global gene expression levels. A, percentage of
over- and underexpressed genes (Y axis) according to copy number ratios (X axis).
Threshold values used for over- and underexpression were >2.184 (global upper 7% of
the cDNA ratios) and <0.4826 (global lower 7% of the expression ratios). B, percentage
of amplified and deleted genes according to expression ratios. Threshold values for
amplification and deletion were >1.5 and <0.7.



20 recurrent regions of DNA amplification have been mapped in
breast cancer by CGH⁵ (9, 10). However, these amplicons are often
large and poorly defined, and their impact on gene expression remains 
unknown. 

We hypothesized that genome-wide identification of those gene 
expression changes that are attributable to underlying gene copy 
number alterations would highlight transcripts that are actively in- 
volved in the causation or maintenance of the malignant phenotype. 
To identify such transcripts, we applied a combination of cDNA and
CGH microarrays to: (a) determine the global impact that gene copy
number variation plays in breast cancer development and progression;
and (b) identify and characterize those genes whose mRNA expression
is most significantly associated with amplification of the
corresponding genomic template.

⁵ The abbreviations used are: CGH, comparative genomic hybridization; FISH,
fluorescence in situ hybridization; RT-PCR, reverse transcription-PCR.

GENE EXPRESSION PATTERNS IN BREAST CANCER

Fig. 2. Genome-wide copy number and expression analysis in the MCF-7 breast cancer cell line. A, chromosomal CGH analysis of MCF-7. The copy number ratio profile (blue
line) across the entire genome from 1p telomere to Xq telomere is shown along with ±1 SD (orange lines). The black horizontal line indicates a ratio of 1.0; red line, a ratio of 0.8;
and green line, a ratio of 1.2. B-C, genome-wide copy number analysis in MCF-7 by CGH on cDNA microarray. The copy number ratios were plotted as a function of the position
of the cDNA clones along the human genome. In B, individual data points are connected with a line, and a moving median of 10 adjacent clones is shown. Red horizontal line, the
copy number ratio of 1.0. In C, individual data points are labeled by color coding according to cDNA expression ratios. The bright red dots indicate the upper 2%, and dark red dots,
the next 5% of the expression ratios in MCF-7 cells (overexpressed genes); bright green dots indicate the lowest 2%, and dark green dots, the next 5% of the expression ratios
(underexpressed genes); the rest of the observations are shown with black crosses. The chromosome numbers are shown at the bottom of the figure, and chromosome boundaries are
indicated with a dashed line.

MATERIALS AND METHODS 

Breast Cancer Cell Lines. Fourteen breast cancer cell lines (BT-20, BT-474,
HCC1428, Hs578t, MCF7, MDA-361, MDA-436, MDA-453, MDA-468,
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the
American Type Culture Collection (Manassas, VA). Cells were grown under
recommended culture conditions. Genomic DNA and mRNA were isolated
using standard protocols.

Copy Number and Expression Analyses by cDNA Microarrays. The 
preparation and printing of the 13,824 cDNA clones on glass slides were
performed as described (11-13). Of these clones, 244 represented uncharacterized
expressed sequence tags, and the remainder corresponded to known
genes. CGH experiments on cDNA microarrays were done as described (14, 
15). Briefly, 20 μg of genomic DNA from breast cancer cell lines and normal
human WBCs were digested for 14-18 h with AluI and RsaI (Life Technologies,
Inc., Rockville, MD) and purified by phenol/chloroform extraction. Six
μg of digested cell line DNAs were labeled with Cy3-dUTP (Amersham
Pharmacia) and normal DNA with Cy5-dUTP (Amersham Pharmacia) using
the Bioprime Labeling kit (Life Technologies, Inc.). Hybridization (14, 15) and 
posthybridization washes (13) were done as described. For the expression 
analyses, a standard reference (Universal Human Reference RNA; Stratagene, 
La Jolla, CA) was used in all experiments. Forty μg of reference RNA were
labeled with Cy3-dUTP and 3.5 μg of test mRNA with Cy5-dUTP, and the
labeled cDNAs were hybridized on microarrays as described (13, 15). For both
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo 
Alto, CA) was used to measure the fluorescence intensities at the target 
locations using the DEARRAY software (16). After background subtraction, 
average intensities at each clone in the test hybridization were divided by the 
average intensity of the corresponding clone in the control hybridization. For 
the copy number analysis, the ratios were normalized on the basis of the 
distribution of ratios of all targets on the array and for the expression analysis 
on the basis of 88 housekeeping genes, which were spotted four times onto the 
array. Low quality measurements (i.e., copy number data with mean reference
intensity <100 fluorescent units, and expression data with both test and 
reference intensity <100 fluorescent units and/or with spot size <50 units) 
were excluded from the analysis and were treated as missing values. The
distributions of fluorescence ratios were used to define cutpoints for increased/
decreased copy number. Genes with CGH ratio >1.43 (representing the upper 
5% of the CGH ratios across all experiments) were considered to be amplified, 
and genes with ratio <0.73 (representing the lower 5%) were considered to be 
deleted. 
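
The ratio calculation and cutpoints described above can be illustrated with a small sketch. It uses only the thresholds stated in the text (reference intensity below 100 fluorescent units treated as missing, CGH ratio above 1.43 amplified, below 0.73 deleted); the function names are hypothetical, and the normalization step is deliberately left out.

```python
AMPLIFIED_CUTOFF = 1.43        # upper 5% of CGH ratios, as stated in the text
DELETED_CUTOFF = 0.73          # lower 5% of CGH ratios, as stated in the text
MIN_REFERENCE_INTENSITY = 100  # fluorescent units; below this the spot is low quality


def cgh_ratio(test_intensity, reference_intensity):
    """Background-subtracted test/reference ratio, or None for a low-quality spot.

    Normalization across the array is omitted in this sketch.
    """
    if reference_intensity < MIN_REFERENCE_INTENSITY:
        return None  # treated as a missing value
    return test_intensity / reference_intensity


def classify_copy_number(ratio):
    """Label a CGH ratio using the cutpoints given in the text."""
    if ratio is None:
        return "missing"
    if ratio > AMPLIFIED_CUTOFF:
        return "amplified"
    if ratio < DELETED_CUTOFF:
        return "deleted"
    return "unchanged"
```

For example, `classify_copy_number(cgh_ratio(300.0, 150.0))` labels a clone with a ratio of 2.0 as amplified, while a clone whose reference intensity is 80 units is reported as missing.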

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate 
the influence of copy number alterations on gene expression, we applied the 
following statistical approach. CGH and cDNA calibrated intensity ratios were 
log-transformed and normalized using median centering of the values in each 
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were 
median centered. For each gene, the CGH data were represented by a vector 
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification.
Amplification was correlated with gene expression using the signal-to-noise
statistics (1). We calculated a weight, $w_g$, for each gene as follows:

$$ w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}} $$

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means and SDs for the expression
levels for amplified and nonamplified cell lines, respectively. To assess the
statistical significance of each weight, we performed 10,000 random permutations
of the label vector. The probability that a gene had a larger or equal
weight by random permutation than the original weight was denoted by α. A
low α (<0.05) indicates a strong association between gene expression and
amplification.
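
A minimal sketch of the permutation procedure described above follows. It assumes the signal-to-noise weight is the difference of the group means divided by the sum of the group SDs, uses the sample SD (the text does not say which SD estimator was used), and requires at least two cell lines in each group. The function names and the zero-denominator convention are illustrative assumptions.

```python
import random
import statistics


def signal_to_noise(expression, labels):
    """Weight for one gene: (mean_amp - mean_non) / (sd_amp + sd_non).

    labels holds 1 for amplified and 0 for nonamplified cell lines.
    Sample SDs are used here (an assumption); each group needs >= 2 values.
    """
    amp = [x for x, lab in zip(expression, labels) if lab == 1]
    non = [x for x, lab in zip(expression, labels) if lab == 0]
    denom = statistics.stdev(amp) + statistics.stdev(non)
    if denom == 0:
        return 0.0  # convention chosen for this sketch only
    return (statistics.mean(amp) - statistics.mean(non)) / denom


def permutation_alpha(expression, labels, n_permutations=10000, seed=0):
    """Fraction of random label permutations whose weight is >= the observed weight."""
    rng = random.Random(seed)
    observed = signal_to_noise(expression, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if signal_to_noise(expression, shuffled) >= observed:
            hits += 1
    return hits / n_permutations
```

Calling `permutation_alpha(expression, labels)` with one gene's log-ratios across the 14 cell lines and its 0/1 amplification vector returns an estimate of α; values below 0.05 would be read as a strong association in the sense used in the text.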

Genomic Localization of cDNA Clones and Amplicon Mapping. Each 
cDNA clone on the microarray was assigned to a Unigene cluster using the 
Unigene Build 141.⁶ A database of genomic sequence alignment information
for mRNA sequences was created from the August 2001 freeze of the University
of California Santa Cruz's GoldenPath database.⁷ The chromosome and
bp positions for each cDNA clone were then retrieved by relating these data
sets. Amplicons were defined as a CGH copy number ratio >2.0 in at least two
adjacent clones in two or more cell lines or a CGH ratio >2.0 in at least three
adjacent clones in a single cell line. The amplicon start and end positions were 



⁶ Internet address: http://research.nhgri.nih.gov/microarray/downloadable_cdna.html.

⁷ Internet address: www.genome.ucsc.edu.
Table 1 Summary of independent amplicons in 14 breast cancer cell lines by
CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1,
and 20q13.2 regions being most commonly amplified. Furthermore,
the boundaries of these amplicons were precisely delineated. In addition,
novel amplicons were identified at 9p13 (38.65-39.25 Mb),
and 17q21.3 (52.47-55.80 Mb).

Direct Identification of Putative Amplification Target Genes. 
The cDNA/CGH microarray technique enables the direct correla- 
tion of copy number and expression data on a gene-by-gene basis 
throughout the genome. We directly annotated high-resolution 
CGH plots with gene expression data using color coding. Fig. 2C 
shows that most of the amplified genes in the MCF-7 breast cancer
cell line at 1p13, 17q22-q23, and 20q13 were highly overexpressed.
A view of chromosome 7 in the MDA-468 cell line
implicates EGFR as the most highly overexpressed and amplified
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons
at 17q12 and 17q22-q23 contained numerous highly overexpressed
genes (Fig. 3B). In addition, several genes, including the
homeobox genes HOXB2 and HOXB7, were highly amplified in a
previously undescribed independent amplicon at 17q21.3. HOXB7
was systematically amplified (as validated by FISH, Fig. 3B, inset)
as well as overexpressed (as verified by RT-PCR, data not shown)
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel 



extended to include neighboring nonamplified clones (ratio, <1.5). The am- 
plicon size determination was partially dependent on local clone density. 
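
The amplicon definition given in Materials and Methods (CGH ratio >2.0 in at least two adjacent clones in two or more cell lines, or in at least three adjacent clones in a single cell line) can be sketched as follows. This is one possible reading of that rule, not the authors' implementation: it assumes every cell line has ratios for the same clones in genomic order, treats missing values (None) as not amplified, and returns index ranges without the boundary extension to neighboring nonamplified clones. All names are hypothetical.

```python
def call_amplicons(ratios_by_line, threshold=2.0):
    """Return (start_index, end_index) clone ranges that satisfy the amplicon rule.

    ratios_by_line maps a cell line name to CGH ratios ordered by genomic position.
    """
    if not ratios_by_line:
        return []
    n = len(next(iter(ratios_by_line.values())))
    high = {
        line: [r is not None and r > threshold for r in ratios]
        for line, ratios in ratios_by_line.items()
    }
    flagged = [False] * n

    # Rule 1: two adjacent clones above threshold in two or more cell lines.
    for i in range(n - 1):
        lines_with_pair = sum(1 for h in high.values() if h[i] and h[i + 1])
        if lines_with_pair >= 2:
            flagged[i] = flagged[i + 1] = True

    # Rule 2: three adjacent clones above threshold in a single cell line.
    for h in high.values():
        for i in range(n - 2):
            if h[i] and h[i + 1] and h[i + 2]:
                flagged[i] = flagged[i + 1] = flagged[i + 2] = True

    # Merge consecutive flagged clones into ranges.
    amplicons, start = [], None
    for i, is_flagged in enumerate(flagged):
        if is_flagged and start is None:
            start = i
        elif not is_flagged and start is not None:
            amplicons.append((start, i - 1))
            start = None
    if start is not None:
        amplicons.append((start, n - 1))
    return amplicons
```

Converting the returned clone indices to base-pair positions, and extending each range to the flanking nonamplified clones, would depend on the clone annotation and is left out of this sketch.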

FISH. Dual-color interphase FISH to breast cancer cell lines was done as 
described (17). Bacterial artificial chromosome clone RP11-361K8 was labeled
with SpectrumOrange (Vysis, Downers Grove, IL), and SpectrumOrange-labeled
probe for EGFR was obtained from Vysis. SpectrumGreen-labeled
chromosome 7 and 17 centromere probes (Vysis) were used as a
reference. A tissue microarray containing 612 formalin-fixed, paraffin-embed- 
ded primary breast cancers (17) was applied in FISH analyses as described 
(18). The use of these specimens was approved by the Ethics Committee of the 
University of Basel and by the NIH. Specimens containing a 2-fold or higher 
increase in the number of test probe signals, as compared with corresponding 
centromere signals, in at least 10% of the tumor cells were considered to be
amplified. Survival analysis was performed using the Kaplan-Meier method 
and the log-rank test. 
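
The FISH scoring criterion stated above (a 2-fold or higher increase in test probe signals relative to centromere signals in at least 10% of tumor cells) can be expressed as a short check. The data layout, one pair of signal counts per scored cell, and the function name are assumptions made for illustration; cells without centromere signals are simply not counted as amplified here.

```python
def is_amplified(cells, min_fold=2.0, min_fraction=0.10):
    """Apply the FISH amplification criterion to one specimen.

    cells is a list of (test_signals, centromere_signals) tuples, one per scored cell.
    """
    if not cells:
        return False
    n_high = sum(
        1 for test, centromere in cells
        if centromere > 0 and test / centromere >= min_fold
    )
    return n_high / len(cells) >= min_fraction
```

A specimen in which one of ten scored cells shows eight test signals against two centromere signals would meet the 10% threshold, whereas one of twenty would not.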

RT-PCR. The HOXB7 expression level was determined relative to
GAPDH. Reverse transcription and PCR amplification were performed using
Access RT-PCR System (Promega Corp., Madison, WI) with 10 ng of mRNA
as a template. HOXB7 primers were 5'-GAGCAGAGGGACTCGGACTT-3'
and 5'-GCGTCAGGTAGCGATTGTAG-3'.

RESULTS 

Global Effect of Copy Number on Gene Expression. 13,824 
arrayed cDNA clones were applied for analysis of gene expression 
and gene copy number (CGH microarrays) in 14 breast cancer cell 
lines. The results illustrate a considerable influence of copy number 
on gene expression patterns. Up to 44% of the highly amplified 
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to 
the global upper 7% of expression ratios), compared with only 6% for 
genes with normal copy number levels (Fig. 1A). Conversely, 10.5%
of the transcripts with high-level expression (cDNA ratio, >10)
showed increased copy number (Fig. 1B). Low-level copy number
increases and decreases were also associated with similar, although 
less dramatic, outcomes on gene expression (Fig. 1). 
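
The percentages reported above are conditional fractions, which the sketch below makes explicit. It assumes each gene is represented by a (CGH ratio, expression ratio) pair and uses the cutoffs quoted in the text and the Fig. 1 legend (CGH ratio >2.5 for highly amplified, expression ratio >2.184 for overexpressed); the function name and data layout are hypothetical.

```python
def percent_overexpressed(pairs, cgh_min=2.5, expression_cutoff=2.184):
    """Percentage of highly amplified genes that are also overexpressed.

    pairs is a list of (cgh_ratio, expression_ratio) tuples, one per gene.
    Returns None when no gene exceeds the CGH threshold.
    """
    selected = [expr for cgh, expr in pairs if cgh > cgh_min]
    if not selected:
        return None
    n_over = sum(1 for expr in selected if expr > expression_cutoff)
    return 100.0 * n_over / len(selected)
```

The converse figure (the share of highly expressed transcripts that show increased copy number) would be computed the same way with the roles of the two ratios swapped.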

Identification of Distinct Breast Cancer Amplicons. Base-pair 
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy
number changes as a function of genomic position (Fig. 2, Supplement
Fig. A). The average spacing of clones throughout the genome
was 267 kb. This high-resolution mapping identified 24 independent
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table
1). Several amplification sites detected previously by chromosomal
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Fig. 3. Annotation of gene expression data on CGH microarray profiles. A, genes in the 
7p11-p12 amplicon in the MDA-468 cell line are highly expressed (red dots) and include
the EGFR oncogene. B, several genes in the 17q12, 17q21.3, and 17q23 amplicons in the
BT-474 breast cancer cell line are highly overexpressed (red) and include the HOXB7
gene. The data labels and color coding are as indicated for Fig. 2C. Insets show
chromosomal CGH profiles for the corresponding chromosomes and validation of the
increased copy number by interphase FISH using EGFR (red) and chromosome 7
centromere probe (green) to MDA-468 (A) and HOXB7-specific probe (red) and
chromosome 17 centromere (green) to BT-474 cells (B).
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Fig. 4. List of 50 genes with a statistically 
significant correlation (α value <0.05) between
gene copy number and gene expression. Name,
chromosomal location, and the α value for each
gene are indicated. The genes have been ordered
according to their position in the genome. The color
maps on the right illustrate the copy number and
expression ratio patterns in the 14 cell lines. The
key to the color code is shown at the bottom of the
graph. Gray squares, missing values. The complete
list of 270 genes is shown in supplemental Fig. B.
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amplification was validated to be present in 10.2% of 363 primary 
breast cancers by FISH to a tissue microarray and was associated 
with poor prognosis of the patients (P = 0.001). 

Statistical Identification and Characterization of 270 Highly 
Expressed Genes in Amplicons. Statistical comparison of expres- 
sion levels of all genes as a function of gene amplification identified 
270 genes whose expression was significantly influenced by copy 
number across all 14 cell lines (Fig. 4, Supplemental Fig. B). According
to the gene ontology data (footnote 8), 91 of the 270 genes represented
hypothetical proteins or genes with no functional annotation, whereas 
179 had associated functional information available. Of these, 151 
(84%) are implicated in apoptosis, cell proliferation, signal transduc- 
tion, and transcription, whereas 28 (16%) had functional annotations 
that could not be directly linked with cancer. 



DISCUSSION

The importance of recurrent gene and chromosome copy number 
changes in the development and progression of solid tumors has been 
characterized in >1000 publications applying CGH (footnote 9) (9, 10), as well
as in a large number of other molecular cytogenetic, cytogenetic, and 
molecular genetic studies. The effects of these somatic genetic 
changes on gene expression levels have remained largely unknown, 
although a few studies have explored gene expression changes occur- 
ring in specific amplicons (15, 19-21). Here, we applied genome- 
wide cDNA microarrays to identify transcripts whose expression 
changes were attributable to underlying gene copy number alterations 
in breast cancer. 

The overall impact of copy number on gene expression patterns was
substantial with the most dramatic effects seen in the case of high- 



8 Internet address: http://www.geneontology.org/.



9 Internet address: http://www.ncbi.nlm.nih.gov/entrez.
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level copy number increase. Low-level copy number gains and losses 
also had a significant influence on expression levels of genes in the 
regions affected, but these effects were more subtle on a gene-by-gene 
basis than those of high-level amplifications. However, the impact of 
low-level gains on the dysregulation of gene expression patterns in 
cancer may be equally important if not more important than that of 
high-level amplifications. Aneuploidy and low-level gains and losses 
of chromosomal arms represent the most common types of genetic 
alterations in breast and other cancers and, therefore, have an influence
on many genes. Our results in breast cancer extend the recent
studies on the impact of aneuploidy on global gene expression pat- 
terns in yeast cells, acute myeloid leukemia, and a prostate cancer 
model system (22-24).

The CGH microarray analysis identified 24 independent breast 
cancer amplicons. We defined the precise boundaries for many am- 
plicons detected previously by chromosomal CGH (9, 10, 25, 26) and 
also discovered novel amplicons that had not been detected previ- 
ously, presumably because of their small size (only 1-2 Mb) or close 
proximity to other larger amplicons. One of these novel amplicons 
involved the homeobox gene region at 17q21.3 and led to the over- 
expression of the HOXB7 and HOXB2 genes. The homeodomain 
transcription factors are known to be key regulators of embryonic 
development and have been occasionally reported to undergo aberrant 
expression in cancer (27, 28). HOXB7 transfection induced cell pro- 
liferation in melanoma, breast, and ovarian cancer cells and increased 
tumorigenicity and angiogenesis in breast cancer (29-32). The pres- 
ent results imply that gene amplification may be a prominent mech- 
anism for overexpressing HOXB7 in breast cancer and suggest that 
HOXB7 contributes to tumor progression and confers an aggressive 
disease phenotype in breast cancer. This view is supported by our 
finding of amplification of HOXB7 in 10% of 363 primary breast 
cancers, as well as an association of amplification with poor prognosis 
of the patients. 

We carried out a systematic search to identify genes whose 
expression levels across all 14 cell lines were attributable to
amplification status. Statistical analysis revealed 270 such genes
(representing ~2% of all genes on the array), including not only
previously described amplified genes, such as HER-2, MYC,
EGFR, ribosomal protein s6 kinase, and AIB3, but also numerous
novel genes such as NRAS-related gene (1p13), syndecan-2 (8q22),
and bone morphogenic protein (20q13.1), whose activation by
amplification may similarly promote breast cancer progression.
Most of the 270 genes have not been implicated previously in
breast cancer development and suggest novel pathogenetic mech- 
anisms. Although we would not expect all of them to be causally 
involved, it is intriguing that 84% of the genes with associated 
functional information were implicated in apoptosis, cell prolifer- 
ation, signal transduction, transcription, or other cellular processes 
that could directly imply a possible role in cancer progression. 
Therefore, a detailed characterization of these genes may provide
biological insights to breast cancer progression and might lead to 
the development of novel therapeutic strategies. 

In summary, we demonstrate application of cDNA microarrays
to the analysis of both copy number and expression levels of over 
12,000 transcripts throughout the breast cancer genome, roughly 
once every 267 kb. This analysis provided: (a) evidence of a 
prominent global influence of copy number changes on gene 
expression levels; (b) a high-resolution map of 24 independent 
amplicons in breast cancer; and (c) identification of a set of 270 
genes, the overexpression of which was statistically attributable to 
gene amplification. Characterization of a novel amplicon at 
17q21.3 implicated amplification and overexpression of the
HOXB7 gene in breast cancer, including a clinical association 



between HOXB7 amplification and poor patient prognosis. Overall,
our results illustrate how the identification of genes activated by 
gene amplification provides a powerful approach to highlight 
genes with an important role in cancer as well as to prioritize and 
validate putative targets for therapy development. 
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Genomic DNA copy number alterations are key genetic events in 
the development and progression of human cancers. Here we 
report a genome-wide microarray comparative genomic hybrid- 
isation (array CGH) analysis of DNA copy number variation in 
a series of primary human breast tumors. We have profiled DNA 
copy number alteration across 6,691 mapped human genes, in 44 
predominantly advanced, primary breast tumors and 10 breast 
cancer cell lines. While the overall patterns of DNA amplification 
and deletion corroborate previous cytogenetic studies, the high-resolution
(gene-by-gene) mapping of amplicon boundaries and
the quantitative analysis of amplicon shape provide significant
improvement in the localization of candidate oncogenes. Parallel 
microarray measurements of mRNA levels reveal the remarkable 
degree to which variation in gene copy number contributes to 
variation in gene expression in tumor cells. Specifically, we find 
that 62% of highly amplified genes show moderately or highly 
elevated expression, that DNA copy number influences gene ex- 
pression across a wide range of DNA copy number alterations 
(deletion, low-, mid- and high-level amplification), that on average, 
a 2-fold change in DNA copy number is associated with a corre- 
sponding 1.5-fold change in mRNA levels, and that overall, at least 
12% of all the variation in gene expression among the breast 
tumors is directly attributable to underlying variation in gene copy 
number. These findings provide evidence that widespread DNA 
copy number alteration can lead directly to global deregulation of 
gene expression, which may contribute to the development or 
progression of cancer. 

Conventional cytogenetic techniques, including comparative 
genomic hybridization (CGH) (1), have led to the identifi- 
cation of a number of recurrent regions of DNA copy number 
alteration in breast cancer cell lines and tumors (2-4). While 
some of these regions contain known or candidate oncogenes 
[e.g., FGFR1 (8p11), MYC (8q24), CCND1 (11q13), ERBB2
(17q12), and ZNF217 (20q13)] and tumor suppressor genes
[RB1 (13q14) and TP53 (17p13)], the relevant gene(s) within
other regions (e.g., gain of 1q, 8q22, and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide
map, delineating the boundaries of DNA copy number alter- 
ations in tumors, should facilitate the localization and identifi- 
cation of oncogenes and tumor suppressor genes in breast 
cancer. In this study, we have created such a map, using 
array-based CGH (5-7) to profile DNA copy number alteration
in a series of breast cancer cell lines and primary tumors. 

An unresolved question is the extent to which the widespread 
DNA copy number changes that we and others have identified 
in breast tumors alter expression of genes within involved 
regions. Because we had measured mRNA levels in parallel in 
the same samples (8), using the same DNA microarrays, we had 
an opportunity to explore on a genomic scale the relationship 
between DNA copy number changes and gene expression. From 



this analysis, we have identified a significant impact of wide- 
spread DNA copy number alteration on the transcriptional 
programs of breast tumors. 

Materials and Methods 

Tumors and Cell Lines. Primary breast tumors were predominantly 
large (>3 cm), intermediate-grade, infiltrating ductal carcino- 
mas, with more than 50% being lymph node positive. The 
fraction of tumor cells within specimens averaged at least 50%. 
Details of individual tumors have been published (8, 9), and 
are summarized in Table 1, which is published as supporting 
information on the PNAS web site, www.pnas.org. Breast cancer 
cell lines were obtained from the American Type Culture 
Collection. Genomic DNA was isolated either using Qiagen 
genomic DNA columns, or by phenol/chloroform extraction 
followed by ethanol precipitation. 

DNA Labeling and Microarray Hybridizations. Genomic DNA label- 
ing and hybridizations were performed essentially as described 
in Pollack et al. (7), with slight modifications. Two micrograms
of DNA was labeled in a total volume of 50 microliters and the 
volumes of all reagents were adjusted accordingly. "Test" DNA 
(from tumors and cell lines) was fluorescently labeled (Cy5) and 
hybridized to a human cDNA microarray containing 6,691 
different mapped human genes (i.e., UniGene clusters). The 
"reference" (labeled with Cy3) for each hybridization was nor- 
mal female leukocyte DNA from a single donor. The fabrication 
of cDNA microarrays and the labeling and hybridization of 
mRNA samples have been described (8). 

Data Analysis and Map Positions. Hybridized arrays were scanned 
on a GenePix scanner (Axon Instruments, Foster City, CA), and 
fluorescence ratios (test/reference) calculated using scanalyze 
software (available at http://rana.lbl.gov). Fluorescence ratios
were normalized for each array by setting the average log 
fluorescence ratio for all array elements equal to 0. Measure- 
ments with fluorescence intensities more than 20% above back- 
ground were considered reliable. DNA copy number profiles 
that deviated significantly from background ratios measured in 
normal genomic DNA control hybridizations were interpreted as 
evidence of real DNA copy number alteration (see Estimating 
Significance of Altered Fluorescence Ratios in the supporting
information). When indicated, DNA copy number profiles are 
displayed as a moving average (symmetric 5-nearest neighbors). 
Map positions for arrayed human cDNAs were assigned by 



Abbreviation: CGH, comparative genomic hybridization. 
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Fig. 1. Genome-wide measurement of DNA copy number alteration by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different
numbers of X chromosomes, for breast cancer cell lines, and for breast tumors. Each row represents a different cell line or tumor, and each column represents
one of 6,691 different mapped human genes present on the microarray, ordered by genome map position from 1pter through Xqter. Moving average (symmetric
5-nearest neighbors) fluorescence ratios (test/reference) are depicted using a log2-based pseudocolor scale (indicated), such that red luminescence reflects
fold-amplification, green luminescence reflects fold-deletion, and black indicates no change (gray indicates poorly measured data). (b) Enlarged view of DNA
copy number profiles across the X chromosome, shown for cell lines containing different numbers of X chromosomes.



identifying the starting position of the best and longest match of 
any DNA sequence represented in the corresponding UniGene 
cluster (10) against the "Golden Path" genome assembly 
(http://genome.ucsc.edu/; Oct 7, 2000 Freeze). For UniGene 
clusters represented by multiple arrayed elements, mean fluo- 
rescence ratios (for all elements representing the same UniGene 
cluster) are reported. For mRNA measurements, fluorescence 
ratios are "mean-centered" (i.e., reported relative to the mean 
ratio across the 44 tumor samples). The data set described here 
can be accessed in its entirety in the supporting information. 
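
The normalization and smoothing steps described in Data Analysis and Map Positions can be sketched as follows. This is a minimal illustration under stated assumptions: base-2 logarithms are assumed because the figures use a log2 scale, "symmetric 5-nearest neighbors" is read here as a centered five-element window (two neighbors on each side), and the function names and edge handling are hypothetical rather than taken from the published software.

```python
import math


def normalize_log_ratios(ratios):
    """Return log2 ratios shifted so that their array-wide mean is 0."""
    logs = [math.log2(r) for r in ratios]
    mean = sum(logs) / len(logs)
    return [x - mean for x in logs]


def moving_average(values, half_window=2):
    """Centered moving average over genes ordered by map position.

    half_window=2 gives a five-element window; near the ends of the
    list the window is truncated (an assumption of this sketch).
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

In this reading, normalization is applied once per array, and smoothing is applied afterwards along each chromosome in map order.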

Results 

We performed CGH on 44 predominantly locally advanced, 
primary breast tumors and 10 breast cancer cell lines, using 
cDNA microarrays containing 6,691 different mapped human 
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the im- 
proved spatial resolution of array CGH, we ordered (fluores- 
cence ratios for) the 6,691 cDNAs according to the "Golden 
Path" (http://genome.ucsc.edu/) genome assembly of the draft
human genome sequences (11). In so doing, arrayed cDNAs not 
only themselves represent genes of potential interest (e.g., 
candidate oncogenes within amplicons), but also provide precise
genetic landmarks for chromosomal regions of amplification and 



deletion. Parallel analysis of DNA from cell lines containing 
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect single-copy
loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published
as supporting information on the PNAS web site). Fluorescence 
ratios were linearly proportional to copy number ratios, which 
were slightly underestimated, in agreement with previous ob- 
servations (7). Numerous DNA copy number alterations were 
evident in both the breast cancer cell lines and primary tumors 
(Fig. 1a), detected in the tumors despite the presence of euploid
non-tumor cell types; the magnitudes of the observed changes 
were generally lower in the tumor samples. DNA copy-number 
alterations were found in every cancer cell line and tumor, and 
on every human chromosome in at least one sample. Recurrent 
regions of DNA copy number gain and loss were readily identifiable.
For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respectively),
as were losses within 1p, 3p, 8p, and 13q (80%/24%,
80%/22%, 80%/22%, and 70%/18%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing 
of gains/losses is provided in Tables 2 and 3, which are published 
as supporting information on the PNAS web site). The total 
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Fig. 2. DNA copy number alteration across chromosome 8 by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different numbers 
of X chromosomes, for breast cancer cell lines, and for breast tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering to 
highlight recurrent copy number changes. The 241 genes present on the microarrays and mapping to chromosome 8 are ordered by position along the 
chromosome. Fluorescence ratios (test/reference) are depicted by a log2 pseudocolor scale (indicated). Selected genes are indicated with color-coded text (red,
increased; green, decreased; black, no change; gray, not well measured) to reflect correspondingly altered mRNA levels (observed in the majority of the subset
of samples displaying the DNA copy number change). The map positions for genes of interest that are not represented on the microarray are indicated in the
row above those genes represented on the array. (b) Graphical display of DNA copy number profile for breast cancer cell line SKBR3. Fluorescence ratios
(tumor/normal) are plotted on a log2 scale for chromosome 8 genes, ordered along the chromosome.



number of genomic alterations (gains and losses) was found to 
be significantly higher in breast tumors that were high grade (P = 
0.008), consistent with published CGH data (3), estrogen receptor
negative (P = 0.04), and harboring TP53 mutations (P =
0.0006) (see Table 4, which is published as supporting information
on the PNAS web site).

The improved spatial resolution of our array CGH analysis is 
illustrated for chromosome 8, which displayed extensive DNA 
copy number alteration in our series. A detailed view of the 
variation in the copy number of 241 genes mapping to chromo- 
some 8 revealed multiple regions of recurrent amplification; 
each of these potentially harbors a different known or previously 
uncharacterized oncogene (Fig. 2a). The complexity of amplicon 
structure is most easily appreciated in the breast cancer cell line 
SKBR3. Although a conventional CGH analysis of 8q in SKBR3 
identified only two distinct regions of amplification (12), we 
observed three distinct regions of high-level amplification (labeled
1-3 in Fig. 2b). For each of these regions we can define the



boundaries of the interval recurrently amplified in the tumors we 
examined; in each case, known or plausible candidate oncogenes 
can be identified (a description of these regions, as well as the 
recurrently amplified regions on chromosomes 17 and 20, can be 
found in Figs. 6 and 7, which are published as supporting 
information on the PNAS web site). 

For a subset of breast cancer cell lines and tumors (4 and 37,
respectively), and a subset of arrayed genes (6,095), mRNA 
levels were quantitatively measured in parallel by using cDNA 
microarrays (8). The parallel assessment of mRNA levels is
useful in the interpretation of DNA copy number changes. For 
example, the highly amplified genes that are also highly ex- 
pressed are the strongest candidate oncogenes within an ampli- 
con. Perhaps more significantly, our parallel analysis of DNA 
copy number changes and mRNA levels provides us the oppor- 
tunity to assess the global impact of widespread DNA copy 
number alteration on gene expression in tumor cells. 

A strong influence of DNA copy number on gene expression 
is evident in an examination of the pseudocolor representations 
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Fig. 3. Concordance between DNA copy number and gene expression across chromosome 17. DNA copy number alteration (Upper) and mRNA levels (Lower)
are illustrated for breast cancer cell lines and tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering (Upper), and the
identical sample order is maintained (Lower). The 354 genes present on the microarrays and mapping to chromosome 17, and for which both DNA copy number
and mRNA levels were determined, are ordered by position along the chromosome; selected genes are indicated in color-coded text (see Fig. 2 legend).
Fluorescence ratios (test/reference) are depicted by separate log2 pseudocolor scales (indicated).



of DNA copy number and mRNA levels for genes on chromo- 
some 17 (Fig. 3). The overall patterns of gene amplification and 
elevated gene expression are quite concordant; i.e., a significant 
fraction of highly amplified genes appear to be correspondingly 
highly expressed. The concordance between high-level amplifi- 
cation and increased gene expression is not restricted to chro- 
mosome 17. Genome-wide, of 117 high-level DNA amplifica- 
tions (fluorescence ratios >4, and representing 91 different 
genes), 62% (representing 54 different genes; see Table 5, which 
is published as supporting information on the PNAS web site) 
are found associated with at least moderately elevated mRNA 
levels (mean-centered fluorescence ratios >2), and 42% (rep- 
resenting 36 different genes) are found associated with compa- 
rably highly elevated mRNA levels (mean-centered fluorescence 
ratios >4). 

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we 
determined the average mRNA levels for each of five classes 
of genes, representing DNA deletion, no change, and low-, 
medium-, and high-level amplification (Fig. 4a). For both the 



breast cancer cell lines and tumors, average mRNA levels 
tracked with DNA copy number across all five classes, in a 
statistically significant fashion (P values for pair-wise Student's 
t tests comparing adjacent classes; cell lines, 4 × 10^-49, 1 × 10^-49,
5 × 10^-5, 1 × 10^-2; tumors, 1 × 10^-43, 1 × 10^-234, 5 × 10^-41,
1 × 10^-4). A linear regression of the average log(DNA copy
number), for each class, against average log(mRNA level) 
demonstrated that on average, a 2-fold change in DNA copy 
number was accompanied by 1.4- and 1.5-fold changes in mRNA 
level for the breast cancer cell lines and tumors, respectively (Fig. 
4a, regression line not shown). Second, we characterized the 
distribution of the 6,095 correlations between DNA copy number
and mRNA level, each across the 37 tumor samples (Fig. 4b).
The distribution of correlations forms a normal-shaped curve, 
but with the peak markedly shifted in the positive direction from 
zero. This shift is statistically significant, as evidenced in a plot 
of observed vs. expected correlations (Fig. 4c), and reflects a 
pervasive global influence of DNA copy number alterations on 
gene expression. Notably, the highest correlations between DNA 
copy number and mRNA level (the right tail of the distribution 
in Fig. 4b) comprise both amplified and deleted genes (data not 
shown). Third, we used a linear regression model to estimate the 
fraction of all variation measured in mRNA levels among the 37 
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tumors that could be attributed to underlying variation in DNA 
copy number. From this analysis, we estimate that, overall, about 
1% of all of the observed variation in mRNA levels can be
explained directly by variation in copy number of the altered 
genes (Fig. 4d). We can reduce the effects of experimental 
measurement error on this estimate by using only that fraction 
of the data most reliably measured (fluorescence intensity/ 
background >3); using that data, our estimate of the percent 
variation in mRNA levels directly attributed to variation in gene 
copy number increases to 12% (Fig. 4). This still undoubtedly
represents a significant underestimate, as the observed variation 
in global gene expression is affected not only by true variation in 
the expression programs of the tumor cells themselves, but also
by the variable presence of non-tumor cell types within clinical 
samples. 
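
The regression reasoning in this section can be made concrete with a small sketch. On a log-log scale, a fitted slope b means that a 2-fold change in DNA copy number corresponds to a 2^b-fold change in mRNA level, so the reported 1.5-fold change corresponds to a slope of about 0.585; the squared correlation gives the fraction of variance explained by a simple linear fit. The function name and the synthetic data in the usage note are assumptions for illustration only and do not reproduce the authors' data set or their exact statistical model.

```python
import math


def fit_loglog(copy_ratio, mrna_ratio):
    """Least-squares fit of log2(mRNA ratio) on log2(copy number ratio).

    Returns (slope, fold change in mRNA per 2-fold copy number change,
    fraction of variance explained).
    """
    x = [math.log2(v) for v in copy_ratio]
    y = [math.log2(v) for v in mrna_ratio]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    # Squared Pearson correlation equals R^2 for a one-predictor fit.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, 2 ** slope, r_squared
```

For example, feeding the function synthetic, noise-free values in which mRNA ratio equals copy ratio raised to the power 0.585 returns a fold change close to 1.5 and a variance fraction of 1; real measurements with noise and admixed non-tumor cells would give a much lower variance fraction, in line with the estimates discussed above.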

Discussion 

This genome-wide, array CGH analysis of DNA copy number 
alteration in a series of human breast tumors demonstrates the 
usefulness of defining amplicon boundaries at high resolution 
(gene-by-gene), and quantitatively measuring amplicon shape, to 
assist in locating and identifying candidate oncogenes. By ana- 
lyzing mRNA levels in parallel, we have also discovered that 
changes in DNA copy number have a large, pervasive, direct 
effect on global gene expression patterns in both breast cancer 



cell lines and tumors. Although the DNA microarrays used in our 
analysis may display a bias toward characterized and/or highly 
expressed genes, because we are examining such a large fraction 
of the genome (approximately 20% of all human genes), and 
because, as detailed above, we are likely underestimating the 
contribution of DNA copy number changes to altered gene 
expression, we believe our findings are likely to be generalizable 
(but would nevertheless still be remarkable if only applicable to 
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in 
chromosome-wide gene expression biases (13). Two recent
studies have begun to examine the global relationship between
DNA copy number and gene expression in cancer cells. In
agreement with our findings, Phillips et al. (14) have shown that
with the acquisition of tumorigenicity in an immortalized prostate
epithelial cell line, new chromosomal gains and losses
resulted in a statistically significant respective increase and
decrease in the average expression level of involved genes. In
contrast, Platzer et al. (15) recently reported that in metastatic
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with
normal colonic epithelium. This report differs substantially from
our finding that 62% of highly amplified genes in breast cancer
exhibit at least 2-fold increased expression. These contrasting
findings may reflect methodological differences between the 
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studies. For example, the study of Platzer et al. (15) may have 
systematically under-measured gene expression changes. In this 
regard it is remarkable that only 14 transcripts of many thousand 
residing within unamplified chromosomal regions were found to
exhibit at least 4-fold altered expression in metastatic colon
cancer. Additionally, their reliance on lower-resolution chromo- 
somal CGH may have resulted in poorly delimiting the bound- 
aries of high-complexity amplicons, effectively overcalling re- 
gions with amplification. Alternatively, the contrasting findings 
for amplified genes may represent real biological differences 
between breast and metastatic colon tumors; resolution of this 
issue will require further studies. 

Our finding that widespread DNA copy number alteration has 
a large, pervasive and direct effect on global gene expression 
patterns in breast cancer has several important implications. 
First, this finding supports a high degree of copy number- 
dependent gene expression in tumors. Second, it suggests that 
most genes are not subject to specific autoregulation or dosage 
compensation. Third, this finding cautions that elevated expres- 
sion of an amplified gene cannot alone be considered strong 
independent evidence of a candidate oncogene's role in tumor- 
igenesis. In our study, fully 62% of highly amplified genes 
demonstrated moderately or highly elevated expression. This 
highlights the importance of high-resolution mapping of ampli- 
con boundaries and shape [to identify the "driving" gene(s) 
within amplicons (16)], on a large number of samples, in addition 
to functional studies. Fourth, this finding suggests that analyzing 



the genomic distribution of expressed genes, even within existing 
microarray gene expression data sets, may permit the inference 
of DNA copy number aberration, particularly aneuploidy (where 
gene expression can be averaged across large chromosomal 
regions; see Fig. 3 and supporting information). Fifth, this 
finding implies that a substantial portion of the phenotypic 
uniqueness (and by extension, the heterogeneity in clinical 
behavior) among patients' tumors may be traceable to underly- 
ing variation in DNA copy number. Sixth, this finding supports 
a possible role for widespread DNA copy number alteration in 
tumorigenesis (17, 18), beyond the amplification of specific 
oncogenes and deletion of specific tumor suppressor genes.
Widespread DNA copy number alteration, and the concomitant 
widespread imbalance in gene expression, might disrupt critical 
stoichiometric relationships in cell metabolism and physiology
(e.g., proteasome, mitotic spindle), possibly promoting further
chromosomal instability and directly contributing to tumor 
development or progression. Finally, our findings suggest the 
possibility of cancer therapies that exploit specific or global 
imbalances in gene expression in cancer. 
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Each year, over 182,000 women in the United States are
diagnosed with breast cancer, and approximately 45,000 die 
of the disease.* Incidence appears to be increasing in the 
United States at a rate of roughly 2% per year. The reasons 
for the increase are unclear, but non-genetic risk factors appear 
to play a large role. 2 

Five-year survival rates range from approximately 65%-85%,
depending on demographic group, with a significant
percentage of women experiencing recurrence of their cancer 
within 10 years of diagnosis. One of the factors most predic- 
tive for recurrence once a diagnosis of breast cancer has been 
made is the number of axillary lymph nodes to which tumor 
has metastasized. Most node-positive women are given adju- 
vant therapy, which increases their survival. However, 20%- 
30% of patients without axillary node involvement also 
develop recurrent disease, and the difficulty lies in how to iden- 
tify this high-risk subset of patients. These patients could 
benefit from increased surveillance, early intervention, and 
treatment. 

Prognostic markers currently used in breast cancer recur- 
rence prediction include tumor size, histological grade, steroid 
hormone receptor status, DNA ploidy, proliferative index, and 
cathepsin D status. Expression of growth factor receptors and 
over-expression of the HER-2/neu oncogene have also been 
identified as having value regarding treatment regimen and 
prognosis. 

HER-2/neu (also known as c-erbB2) is an oncogene that 
encodes a transmembrane glycoprotein that is homologous 
to, but distinct from, the epidermal growth factor receptor. 
Numerous studies have indicated that high levels of expres- 
sion of this protein are associated with rapid tumor growth, 
certain forms of therapy resistance, and shorter disease-free 
survival. The gene has been shown to be amplified and/or 
overexpressed in 10%-30% of invasive breast cancers and in 
40%-60% of intraductal breast carcinoma. 3 

There are two distinct FDA-approved methods by which 
HER-2/neu status can be evaluated: immunohistochemistry 
(IHC, HercepTest™) and FISH (fluorescent in situ hybridization,
PathVysion™ Kit). Both methods can be performed on
archived and current specimens. The first method allows visual 
assessment of the amount of HER-2/neu protein present on 
the cell membrane. The latter method allows direct quantification
of the level of gene amplification present in the tumor,
enabling differentiation between low- versus high-amplification.
At least one study has demonstrated a difference in
recurrence risk in women younger than 40 years of age for 
low- versus high-amplified tumors (54.5% compared to 
85.7%); this is compared to a recurrence rate of 16.7% for 
patients with no HER-2/neu gene amplification. 4 HER-2/neu 
status may be particularly important to establish in women with 
small (< 1 cm) tumor size. 

The choice of methodology for determination of HER-2/ 
neu status depends in part on the clinical setting. FDA approval 
for the Vysis FISH test was granted based on clinical trials 
involving 1549 node-positive patients. Patients received one 
of three different treatments consisting of different doses of 
cyclophosphamide, Adriamycin, and 5-fluorouracil (CAF). 
The study showed that patients with amplified HER-2/neu 
benefited from treatment with higher doses of Adriamycin-based
therapy, while those with normal HER-2/neu levels did
not. The study therefore identified a subset of women who,
because they did not benefit from more aggressive treatment,
did not need to be exposed to the associated side effects. In 
addition, other evidence indicates that HER-2/neu amplifica- 
tion in node-negative patients can be used as an independent 
prognostic indicator for early recurrence, recurrent disease at 
any time and disease-related death. 5 Demonstration of HER-2/neu
gene amplification by FISH has also been shown to be
of value in predicting response to chemotherapy in stage-2 
breast cancer patients. 

Selection of patients for Herceptin® (Trastuzumab) mono- 
clonal antibody therapy, however, is based upon demonstra- 
tion of HER-2/neu protein overexpression using HercepTest™. 
Studies using Herceptin® in patients with metastatic breast 
cancer show an increase in time to disease progression, 
increased response rate to chemotherapeutic agents and a small 
increase in overall survival rate. The FISH assays have not yet 
been approved for this purpose, and studies looking at response 
to Herceptin® in patients with or without gene amplification 
status determined by FISH are in progress.

In general, FISH and IHC results correlate well. However, 
subsets of tumors are found which show discordant results; 
i.e., protein overexpression without gene amplification or lack 
of protein overexpression with gene amplification. The clini- 
cal significance of such results is unclear. Based on the above 
considerations, HER-2/neu testing at SHMC/PAML will utilize
immunohistochemistry (HercepTest™) as a screen, followed
by FISH in IHC-negative cases. Alternatively, either
method may be ordered individually depending on the clinical
setting or clinician preference.

CPT code information

HER-2/neu via IHC

88342 (including interpretive report)

HER-2/neu via FISH

88271 x2 Molecular cytogenetics; DNA probe, each
88274 Molecular cytogenetics, interphase in situ hybridization,
analyze 25-99 cells
88291 Cytogenetics and molecular cytogenetics, interpretation
and report



Procedural Information 

Immunohistochemistry is performed using the FDA-approved
DAKO antibody kit, HercepTest™. The DAKO kit contains
reagents required to complete a two-step immunohistochemical
staining procedure for routinely processed, paraffin-embedded
specimens. Following incubation with the primary
rabbit antibody to human HER-2/neu protein, the kit employs
a ready-to-use dextran-based visualization reagent. This reagent
consists of both secondary goat anti-rabbit antibody
molecules and horseradish peroxidase molecules linked to a
common dextran polymer backbone, thus eliminating the need
for sequential application of link antibody and peroxidase-conjugated
antibody. Enzymatic conversion of the subsequently
added chromogen results in formation of visible
reaction product at the antigen site. The specimen is then
counterstained, and a pathologist using light microscopy
interprets the results.

FISH analysis at SHMC/PAML is performed using the
FDA-approved PathVysion™ HER-2/neu DNA probe kit, produced
by Vysis, Inc. Formalin-fixed, paraffin-embedded breast
tissue is processed using routine histological methods, and then
slides are treated to allow hybridization of DNA probes to the
nuclei present in the tissue section. The PathVysion™ kit contains
two direct-labeled DNA probes, one specific for the
alphoid repetitive DNA (CEP 17, spectrum orange) present at
the chromosome 17 centromere and the second for the HER-2/neu
oncogene located at 17q11.2-12 (spectrum green). Enumeration
of the probes allows a ratio of the number of copies
of HER-2/neu to the number of copies of chromosome 17 to
be obtained; this enables quantification of low versus high
amplification levels, and allows an estimate of the percentage
of cells with HER-2/neu gene amplification. The clinically
relevant distinction is whether the gene amplification is due
to increased gene copy number on the two chromosome 17
homologues normally present or an increase in the number of
chromosome 17s in the cells. In the majority of cases, ratio
equivalents less than 2.0 are indicative of a normal/negative
result; ratios of 2.1 and over indicate that amplification is
present and to what degree. Interpretation of this data will be
performed and reported from the Vysis-certified Cytogenetics
laboratory at SHMC.
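
The ratio cut-offs described above can be illustrated with a minimal Python sketch. It assumes the ratio is taken as HER-2/neu signals divided by chromosome 17 centromere (CEP 17) signals, the orientation under which ratios of 2.1 and over indicate amplification. The function names and the "borderline" label for ratios from 2.0 up to 2.1, which the text does not classify, are hypothetical, and the sketch is not the laboratory's reporting procedure.

```python
def her2_cep17_ratio(her2_signals, cep17_signals):
    """Return HER-2/neu signals per CEP 17 signal, summed over counted nuclei."""
    if her2_signals < 0 or cep17_signals <= 0:
        raise ValueError("HER-2/neu count must be >= 0 and CEP 17 count > 0")
    return her2_signals / cep17_signals


def classify_ratio(ratio):
    """Apply the cut-offs quoted in the text: < 2.0 normal, >= 2.1 amplified."""
    if ratio < 2.0:
        return "normal/negative"
    if ratio >= 2.1:
        return "amplified"
    # The text leaves this narrow band unclassified; the label is an assumption.
    return "borderline"


if __name__ == "__main__":
    # Hypothetical signal totals (HER-2/neu, CEP 17) for two specimens.
    for her2, cep17 in [(60, 58), (150, 60)]:
        ratio = her2_cep17_ratio(her2, cep17)
        print(f"ratio {ratio:.2f}: {classify_ratio(ratio)}")
```

Because the ratio is normalized to the centromere count, an increase in the number of chromosome 17 copies alone raises both counts and leaves the ratio near 1, which is how the method separates true gene amplification from polysomy.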

ABSTRACT Wnt family members are critical to many
developmental processes, and components of the Wnt signaling
pathway have been linked to tumorigenesis in familial and
sporadic colon carcinomas. Here we report the identification
of two genes, WISP-1 and WISP-2, that are up-regulated in the
mouse mammary epithelial cell line C57MG transformed by
Wnt-1, but not by Wnt-4. Together with a third related gene,
WISP-3, these proteins define a subfamily of the connective
tissue growth factor family. Two distinct systems demonstrated
WISP induction to be associated with the expression of
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline-repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overexpressed
(4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-20q13
and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1
signaling and that aberrant levels of WISP expression in colon
cancer may play a role in colon tumorigenesis.



Wnt-1 is a member of an expanding family of cysteine-rich, 
glycosylated signaling proteins that mediate diverse develop- 
mental processes such as the control of cell proliferation, 
adhesion, cell polarity, and the establishment of cell fates (1, 
2). Wnt-1 originally was identified as an oncogene activated by 
the insertion of mouse mammary tumor virus in virus-induced 
mammary adenocarcinomas (3, 4). Although Wnt-1 is not 
expressed in the normal mammary gland, expression of Wnt-1 
in transgenic mice causes mammary tumors (5).

In mammalian cells, Wnt family members initiate signaling
by binding to the seven-transmembrane spanning Frizzled
receptors and recruiting the cytoplasmic protein Dishevelled
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the
kinase activity of the normally constitutively active glycogen
synthase kinase-3β (GSK-3β), resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the transcription
factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations contribute
to the development of these types of cancer, implicating
the Wnt pathway in tumorigenesis (1).

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the downstream
components transcriptionally activated by
Wnt have been characterized. Those that have been described
cannot account for all of the diverse functions attributed to
Wnt signaling. Among the candidate Wnt target genes are
those encoding the nodal-related 3 gene, Xnr3, a member of
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the 
Wnt signaling pathway (10). 

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Overexpression
of Wnt-1 in this cell line is sufficient to induce a
partially transformed phenotype, characterized by elongated
and refractile cells that lose contact inhibition and form a
multilayered array (12, 13). We reasoned that genes differentially
expressed between these two cell lines might contribute
to the transformed phenotype. 

In this paper, we describe the cloning and characterization 
of two genes up-regulated in Wnt-1 transformed cells, WISP-1 
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which 
includes connective tissue growth factor (CTGF), Cyr61, and 
nov, a family not previously linked to Wnt signaling. 

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 µg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 µg
of poly(A)+ RNA from the parent C57MG cells. The subtracted
cDNA library was subcloned into a pGEM-T vector for
further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
To whom reprint requests should be addressed. e-mail: diane@gene.com.

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones encoding
full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides 1463-1512. Full-length
cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of
first-strand cDNA was performed with human Multiple Tissue
cDNA panels (CLONTECH) and 300 µM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles.
WISP and glyceraldehyde-3-phosphate dehydrogenase primer
sequences are available on request.

In Situ Hybridization. 33P-labeled sense and antisense riboprobes
were transcribed from an 897-bp PCR product corresponding
to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of
mouse WISP-2. All tissues were processed as described (40).

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers. 

Cell Lines, Tumors, and Mucosa Specimens. Tissue specimens
were obtained from the Department of Pathology (University
of Pittsburgh) for patients undergoing colon resection
and from the University of Leeds, United Kingdom. Genomic 
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29, 
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS 
174T). DNA concentration was determined by using Hoechst 
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative
gene amplification and RNA expression of WISPs and c-myc in
the cell lines, colorectal tumors, and normal mucosa were
determined by quantitative PCR. Gene-specific primers and
fluorogenic probes (sequences available on request) were
designed and used to amplify and quantitate the genes. The
relative gene copy number was derived by using the formula
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate dehydrogenase
housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems.
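
As a rough illustration of the 2^(ΔCt) calculation described above, the following Python sketch computes a relative level from threshold-cycle (Ct) values. The function name, the argument names, and the way the housekeeping-gene normalization is applied (subtracting the housekeeping ΔCt from the target ΔCt) are assumptions made for illustration; the text states only that the signal was normalized to glyceraldehyde-3-phosphate dehydrogenase, not how.

```python
def relative_level(ct_reference, ct_sample, hk_reference=None, hk_sample=None):
    """Return 2**dCt, where dCt = Ct(reference) - Ct(sample).

    ct_reference: cycles needed to detect the target in the reference
                  (e.g., normal DNA or normal mucosa RNA).
    ct_sample:    cycles needed to detect the target in the tumor sample.
    hk_*:         optional Ct values for a housekeeping gene in the same two
                  samples; if both are given, their difference is subtracted
                  (an assumed normalization scheme).
    """
    delta_ct = ct_reference - ct_sample
    if hk_reference is not None and hk_sample is not None:
        delta_ct -= hk_reference - hk_sample
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Hypothetical Ct values: the tumor sample is detected one cycle earlier.
    print(relative_level(26.0, 25.0))              # one-cycle difference
    print(relative_level(26.0, 24.0, 20.0, 19.0))  # with housekeeping correction
```

Each cycle of difference corresponds to a doubling, so a sample detected one cycle earlier than the reference is read as carrying about twice as much template.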

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify Wnt-1-inducible
genes, we used the technique of SSH using the
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially expressed
cDNAs (1384 total) were sequenced. Thirty-nine
percent of the sequences matched known genes or homologues,
32% matched expressed sequence tags, and 29% had
no match. To confirm that the transcript was differentially
expressed, semiquantitative reverse transcription-PCR and
Northern analysis were performed by using mRNA from the
C57MG and C57MG/Wnt-1 cells.

Two of the cDNAs, WISP-1 and WISP-2, were differentially 
expressed, being induced in the C57MG/Wnt-1 cell line, but 
not in the parent C57MG cells or C57MG cells overexpressing 
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce 
the morphological transformation of C57MG cells and has no 
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell 
line and WISP-2 by approximately 5-fold by both Northern 
analysis and reverse transcription-PCR. 

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells express- 
ing the Wnt-1 gene under the control of a tetracycline- 
repressible promoter produce low amounts of Wnt-1 in the 
repressed state but show a strong induction of Wnt-1 mRNA 
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the 
sequence compared with mouse WISP-1. The cDNA sequences 
of mouse and human WISP-1 were 1,766 and 2,830 bp in length,
respectively, and encode proteins of 367 aa, with predicted
relative molecular masses of ≈40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cysteine
residues, and four potential N-linked glycosylation sites
and are 84% identical (Fig. 2A).

Full-length cDNA clones of mouse and human WISP-2 were 
1,734 and 1,293 bp in length, respectively, and encode proteins 
of 251 and 250 aa, respectively, with predicted relative molecular
masses of 27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential 
N-linked glycosylation sites, and mouse WISP-2 has one at
position 197. WISP-2 has 28 cysteine residues that are conserved
among the 38 cysteines found in WISP-1.

Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/Wnt-4
cells. Poly(A)+ RNA (2 µg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential
signal sequence, insulin-like growth factor-binding protein (IGF-BP),
VWC, thrombospondin (TSP), and C-terminal (CT) domains are
underlined.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the 
WISP-1 protein sequence and identified several ESTs as 
potentially related sequences. We identified a homologous 
protein that we have called WISP-3. A full-length human 
WISP-3 cDNA of 1,371 bp was isolated corresponding to those 
ESTs that encode a 354-aa protein with a predicted molecular 
mass of 39,293. WISP-3 has two potential N-linked glycosyl- 
ation sites and 36 cysteine residues. An alignment of the three 
human WISP proteins shows that WISP-1 and WISP-3 are the 
most similar (42% identity), whereas WISP-2 has 37% identity 
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).

WISPs Are Homologous to the CTGF Family of Proteins. 
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology (36-44%)
was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18,
19). nov (nephroblastoma overexpressed) is an immediate
early gene associated with quiescence and found altered in
Wilms tumors (20). The proteins of the CCN family share
functional, but not sequence, similarity to Wnt-1. All are
secreted, cysteine-rich heparin binding glycoproteins that associate
with the cell surface and extracellular matrix.

Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic representation
of the WISP proteins showing the domain structure and cysteine
residues (vertical lines). The four cysteine residues in the VWC domain
that are absent in WISP-3 are indicated with a dot. (C) Expression of
WISP mRNA in human tissues. PCR was performed on human
multiple-tissue cDNA panels (CLONTECH) from the indicated adult
and fetal tissues.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence (GCGCCXXC)
conserved in most insulin-like growth factor (IGF)-binding
proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third 
position instead of a glycine. CTGF recently has been shown 
to specifically bind IGF (22) and a truncated nov protein 
lacking the IGF-BP domain is oncogenic (23). The von Wil- 
lebrand factor type C module (VWC), also found in certain 
collagens and mucins, covers the next 10 cysteine residues, and 
is thought to participate in protein complex formation and 
oligomerization (24). The VWC domain of WISP-3 differs 
from all CCN family members described previously, in that it 
contains only six of the 10 cysteine residues (Fig. 3 A and B). 
A short variable region follows the VWC domain. The third 
module, the thrombospondin (TSP) domain, is involved in
binding to sulfated glycoconjugates and contains six cysteine 
residues and a conserved WSxCSxxCG motif first identified in 
thrombospondin (25). The C-terminal (CT) module contain- 
ing the remaining 10 cysteines is thought to be involved in 
dimerization and receptor binding (26). The CT domain is 
present in all CCN family members described to date but is 
absent in WISP-2 (Fig. 3 A and B). The existence of a putative
signal sequence and the absence of a transmembrane domain 
suggest that WISPs are secreted proteins, an observation 
supported by an analysis of their expression and secretion from 
mammalian cell and baculovirus cultures (data not shown). 
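
The two sequence motifs named in this paragraph, GCGCCXXC in the IGF-binding domain and WSxCSxxCG in the TSP domain, can be searched for with ordinary regular expressions. The Python sketch below is illustrative only: it treats X/x as any single residue, allows glutamine at the third position of the first motif as described for WISP-1, and runs on a made-up toy peptide rather than on any real WISP sequence. The names `IGFBP_MOTIF`, `TSP_MOTIF`, and `find_motifs` are hypothetical.

```python
import re

# X/x in the published motifs is read as "any single residue" (an assumption).
# [GQ] allows the glutamine-for-glycine substitution described for WISP-1.
IGFBP_MOTIF = re.compile(r"GC[GQ]CC..C")
TSP_MOTIF = re.compile(r"WS.CS..CG")


def find_motifs(protein_sequence):
    """Return 0-based start positions of each motif in a one-letter sequence."""
    seq = protein_sequence.upper()
    return {
        "IGFBP": [m.start() for m in IGFBP_MOTIF.finditer(seq)],
        "TSP": [m.start() for m in TSP_MOTIF.finditer(seq)],
    }


if __name__ == "__main__":
    # Toy peptide invented for this example; not a real protein sequence.
    toy = "MAAGCGCCPVCAAAWSPCSTTCGKK"
    print(find_motifs(toy))
```

A motif scan of this kind only flags candidate domains; assigning a domain in practice also depends on the surrounding cysteine pattern shown in the alignments.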

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C).
Little or no expression was detected in the brain, liver, skeletal
muscle, colon, peripheral blood leukocytes, prostate, testis, or
thymus. WISP-2 had a more restricted tissue expression and
was detected in adult skeletal muscle, colon, ovary, and fetal
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and
small intestine.

In Situ Localization of WISP-1 and WISP-2. Expression of
WISP-1 and WISP-2 was assessed by in situ hybridization in
mammary tumors from Wnt-1 transgenic mice. Strong expression
of WISP-1 was observed in stromal fibroblasts lying within
the fibrovascular tumor stroma (Fig. 4 A-D). However, low-level
WISP-1 expression also was observed focally within tumor
cells (data not shown). No expression was observed in normal
breast. Like WISP-1, WISP-2 expression also was seen in the
tumor stroma in breast tumors from Wnt-1 transgenic animals
(Fig. 4 E-H). However, WISP-2 expression in the stroma was
in spindle-shaped cells adjacent to capillary vessels, whereas
the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The corresponding
dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal (s) fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells
appeared to be adjacent to capillary vessels whereas tumor cells are
negative (G and H).

Chromosome Localization of the WISP Genes. The chromosomal
location of the human WISP genes was determined
by radiation hybrid mapping panels. WISP-1 is approximately
3.48 cR from the meiotic marker AFM259xc5 [logarithm of
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the
same region as the human locus of the novH family member
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine
mapping indicates that WISP-1 is located near D8S1712 STS.
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on
chromosome 20q12-20q13.1. Human WISP-3 mapped to chromosome
6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene
MYB (27, 29).

Amplification and Aberrant Expression of WISPs in Human 
Colon Tumors. Amplification of protooncogenes is seen in
many human tumors and has etiological and prognostic sig- 
nificance. For example, in a variety of tumor types, c-myc 
amplification has been associated with malignant progression 
and poor prognosis (30). Because WISP-1 resides in the same 
general chromosomal location (8q24) as c-myc, we asked 
whether it was a target of gene amplification, and, if so, 
whether this amplification was independent of the c-myc locus.
Genomic DNA from human colon cancer cell lines was 
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold) 
amplification, with the HT-29 and WiDr cell lines demonstrat- 
ing an 8-fold increase. Significantly, the pattern of amplifica- 
tion observed did not correlate with that observed for c-myc, 
indicating that the c-myc gene is not part of the amplicon that 
involves the WISP-1 locus. 

We next examined whether the WISP genes were amplified 
in a panel of 25 primary human colon adenocarcinomas. The 
relative WISP gene copy number in each colon tumor DNA 
was compared with pooled normal DNA from 10 donors by 
quantitative PCR (Fig. 6). The copy number of WISP-1 and 
WISP-2 was significantly greater than one, approximately 
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold 
for WISP-2 in 92% of the tumors (P < 0.001 for each). The 
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was significantly
higher than that of WISP-1 (P < 0.001).

The levels of WISP transcripts in RNA isolated from 19
adenocarcinomas and their matched normal mucosa were
assessed by quantitative PCR (Fig. 7). The level of WISP-1
RNA present in tumor tissue varied but was significantly
increased (2- to >25-fold) in 84% (16/19) of the human colon
tumors examined compared with normal adjacent mucosa.
Four of 19 tumors showed greater than 10-fold overexpression.
In contrast, in 79% (15/19) of the tumors examined, WISP-2
RNA expression was significantly lower in the tumor than the
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal
mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold.

Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by quantitative
PCR. (B) Southern blots containing genomic DNA (10 µg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

Fig. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes in 25
adenocarcinomas was assayed by quantitative PCR, by comparing
DNA from primary human tumors with pooled DNA from 10 healthy
donors. The data are means ± SEM from one experiment done in
triplicate. The experiment was repeated at least three times.

Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assayed by
quantitative PCR. The Dukes stage of the tumor is listed under the
sample number. The data are means ± SEM from one experiment
done in triplicate. The experiment was repeated at least twice.



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and 
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1.

Three of the genes isolated, WISP-1, WISP-2, and WISP-3,
are members of the CCN family of growth factors, which
includes CTGF, Cyr61, and nov, a family not previously linked
to Wnt signaling. 

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of 
a tetracycline-repressible promoter, and the second was in 
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1, 
whereas normal breast tissue does not. No WISP RNA expression 
was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly 
induced by the downstream components of the Wnt-1 signaling 
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of 
WISP RNA were measured in Wnt-1-transformed cells, hours 
or days after Wnt-1 transformation. Thus, WISP expression 
could result from Wnt-1 signaling directly through β-catenin 
transcription factor regulation or alternatively through Wnt-1 
signaling turning on a transcription factor, which in turn 
regulates WISPs. 

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain, 
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3. 
This domain is thought to be involved in receptor binding and 
dimerization. Growth factors, such as TGF-β, platelet-derived 
growth factor, and nerve growth factor, which contain a cystine 
knot motif exist as dimers (32). It is tempting to speculate that 
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2 
exists as a monomer. If the CT domain is also important for 
receptor binding, WISP-2 may bind its receptor through a 
different region of the molecule than the other CCN family 
members. No specific receptors have been identified for CTGF 
or nov. A recent report has shown that integrin αvβ3 serves as 
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from 
Wnt-1 transgenic animals is consistent with previous obser- 
vations that transcripts for the related CTGF gene are pri- 
marily expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed 
that mammary tumor cells or inflammatory cells at the tumor 
interstitial interface secrete TGF-β1, which is the stimulus for 
stromal proliferation (34). TGF-β1 is secreted by a large 
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was 
observed in the stromal cells that surrounded the tumor cells 
(epithelial cells) in the Wnt-1 transgenic mouse sections of 
breast tissue. This finding suggests that paracrine signaling 
could occur in which the stromal cells could supply WISP-1 and 
WISP-2 to regulate tumor cell growth on the WISP extracellular 
matrix. Stromal cell-derived factors in the extracellular 
matrix have been postulated to play a role in tumor cell 
migration and proliferation (35). The localization of WISP-1 
and WISP-2 in the stromal cells of breast tumors supports this 
paracrine model. 

An analysis of WISP-1 gene amplification and expression in 
human colon tumors showed a correlation between DNA 
amplification and overexpression, whereas overexpression of 
WISP-3 RNA was seen in the absence of DNA amplification. 
In contrast, WISP-2 DNA was amplified in the colon tumors, 
but its mRNA expression was significantly reduced in the 
majority of tumors compared with the expression in normal 
colonic mucosa from the same patient. The gene for human 
WISP-2 was localized to chromosome 20q12-20q13, at a region 
frequently amplified and associated with poor prognosis in 
node-negative breast cancer and many colon cancers, suggesting 
the existence of one or more oncogenes at this locus 
(36-38). Because the center of the 20q13 amplicon has not yet 
been identified, it is possible that the apparent amplification 
observed for WISP-2 may be caused by another gene in this 
amplicon. 

A recent manuscript on rCop-1, the rat orthologue of 
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions 
of either gene can cause the stabilization and accumulation of 
cytoplasmic β-catenin, which presumably contributes to human 
carcinogenesis through the activation of target genes such 
as the WISPs. Although the mechanism by which Wnt-1 
transforms cells and induces tumorigenesis is unknown, the 
identification of WISPs as genes that may be regulated down- 
stream of Wnt-1 in C57MG cells suggests they could be 
important mediators of Wnt-1 transformation. The amplifica- 
tion and altered expression patterns of the WISPs in human 
colon tumors may indicate an important role for these genes 
in tumor development. 
ABSTRACT The consistent cytogenetic translocation of 
chronic myelogenous leukemia (the Philadelphia chromosome, 
Ph1) has been observed in cells of multiple hematopoietic 
lineages. This translocation creates a chimeric gene composed 
of breakpoint-cluster-region (bcr) sequences from chromosome 
22 fused to a portion of the abl oncogene on chromosome 9. The 
resulting gene product (P210c-abl) resembles the transforming 
protein of the Abelson murine leukemia virus in its structure 
and tyrosine kinase activity. P210c-abl is expressed in 
Ph1-positive cell lines of myeloid lineage and in clinical specimens 
with myeloid predominance. We show here that Epstein-Barr 
virus-transformed B-lymphocyte lines that retain Ph1 can 
express P210c-abl. The level of expression in these B-cell lines is 
generally lower and more variable than that observed for 
myeloid lines. Protein expression is not related to amplification 
of the abl gene but to variation in the level of bcr-abl mRNA 
produced from a single Ph1 template. 

Chronic myelogenous leukemia (CML) is a disease of the 
pluripotent stem cell (1). In greater than 95% of patients, the 
leukemic cells contain the cytogenetic marker known as the 
Philadelphia chromosome, or Ph1 (2). This reciprocal 
translocation event between the long arms of chromosomes 
9 and 22 has been used as a disease-specific marker for 
diagnosis and evaluation of therapy. Multiple hematopoietic 
lineages, including myeloid and B-lymphoid, contain Ph1 in 
early or chronic phase, as well as in the more acute accelerated 
and blast crisis phases of the disease. 

One molecular consequence of Ph1 is the translocation of 
the chromosomal arm containing the c-abl gene on chromosome 
9 into the middle of the breakpoint-cluster region (bcr) 
gene on chromosome 22 (3-6). Although the precise 
translocation breakpoints are variable, an RNA-splicing 
mechanism generates a very similar 8-kilobase (kb) mRNA in 
each case (5-9). The hybrid bcr-abl message encodes a 
structurally altered form of the abl oncogene product, called 
P210c-abl (10-13), with an amino-terminal segment derived 
from a portion of the exons of bcr on chromosome 22 and a 
carboxyl-terminal segment derived from a major portion of 
the exons of the c-abl gene on chromosome 9. The chimeric 
structure of bcr-abl and the resulting P210c-abl is similar to the 
structure of the Abelson murine leukemia virus gag-abl 
genome and resulting P160v-abl transforming gene product. 
Both proteins have very similar tyrosine kinase activities (10, 
11, 14) which can be distinguished by their relative stability 
to denaturing detergents and by their ATP requirements from 
the recently described tyrosine kinase activity of the c-abl 
gene product (15). 
In concert with structural modification of the amino-terminal 
portion of the abl gene, increased level of expression 
has been implicated in activation of c-abl oncogenic potential. 
Myeloid and erythroid cell lines and clinical samples 
derived from acute-phase CML patients contain about 10-fold 
higher levels of the 8-kb bcr-abl mRNA and P210c-abl than 
the c-abl mRNA forms (6 and 7 kb) and P145c-abl gene product 
(5, 8, 9, 11). The higher level of expression of the chimeric 
bcr-abl message in acute-phase cells is not likely to be solely 
due to the presence of the bcr promoter sequences at the 5' 
end of the gene, since the normal 4.5-kb and 6.7-kb 
bcr-encoded mRNA species are expressed at an even lower level 
than the normal c-abl messages (5, 6). 

We have analyzed a series of Epstein-Barr virus-immortalized 
B-lymphoid cell lines derived from CML patients (16). 
With such in vitro clonal cell lines, we can evaluate whether 
the presence of Ph1 always results in synthesis of the chimeric 
bcr-abl message and protein, and whether the quantitative 
expression varies for cells of B-lymphoid lineage as compared 
to previously examined myeloid cell lines. Our results 
show that cell lines that retain Ph1 do express bcr-abl message 
and protein, but that the level is generally lower and more 
variable than previously seen for myeloid cell lines. The 
demonstration that the Ph1 chromosomal template can vary 
in its level of expression of P210c-abl suggests that secondary 
mechanisms, beyond the translocation itself, contribute to 
the regulation of the bcr-abl gene in different cell types or 
subclones that derive from the affected stem cell. 


MATERIALS AND METHODS 

Cells and Cell Labeling. Epstein-Barr virus-transformed 
B-lymphoid cell lines were established from peripheral blood 
samples of chronic- and acute-phase CML patients as reported 
(16). The cell lines are designated according to patient 
number, karyotype, and lineage. For example, 
SK-CML7Bt(9;22)-33 refers to CML patient 7, B-lymphoid cell 
line, 9;22 translocation (Ph1), cell line 33; and SK-CML7BN-2 
refers to B-cell line 2 with a normal karyotype derived from 
the same patient. Repeat karyotype analysis was performed 
to verify the retention of Ph1 just prior to analysis for abl 
protein and RNA. Cells were maintained in RPMI 1640 
medium with 20% fetal bovine serum. We have not observed 
any consistent pattern of in vitro growth rate that correlates 
to the stage of disease at the time of transformation with 
Epstein-Barr virus. Cells (1.5 x 10^7) were washed twice with 
Dulbecco's modified Eagle's medium lacking phosphate and 
supplemented with 5% dialyzed fetal bovine serum. Cells 
were then resuspended in 2 ml of the minimal medium. 
Labeling was started with the addition of [32P]orthophosphate 
(1 mCi/ml; ICN; 1 Ci = 37 GBq) and continued at 37°C 
for 3-4 hr. 

Abbreviations: bcr, breakpoint-cluster region; CML, chronic 
myelogenous leukemia; kb, kilobase(s). 

Present address: Department of Genetics, University of Washing- 

Immunoprecipitation and Immunoblotting. Immunoprecipitations 
were carried out as described (10). Cells (1.5 x 10^7) 
were washed with phosphate-buffered saline and extracted 
with 3-5 ml of phosphate lysis buffer (1% Triton X-100/0.1% 
NaDodSO4/0.5% deoxycholate/10 mM Na2HPO4, pH 7.5/ 
100 mM NaCl) with 5 mM EDTA and 5 mM 
phenylmethylsulfonyl fluoride. Extracts were clarified by centrifugation 
and precipitated with normal or rabbit anti-abl sera 
(anti-pEX-2 or anti-pEX-5) (17). The precipitated proteins were 
electrophoresed in a NaDodSO4/8% polyacrylamide gel. 
32P-labeled proteins were detected by autoradiography. 
Alternatively, abl proteins were detected by immunoblotting. 
Extracts from unlabeled cells were clarified, and proteins 
were concentrated by immunoprecipitation with rabbit antisera 
against abl-encoded proteins [anti-pEX-2 and anti-pEX-5 
combined (17)] and then fractionated in 8% acrylamide gels. 
The proteins were transferred from the gel to nitrocellulose 
filters, using protease-facilitated transfer (18). The 
abl-encoded proteins were detected using murine monoclonal 
antibodies as a probe and peroxidase-conjugated goat 
anti-mouse second stage antibody (Bio-Rad) for development. 
Rabbit antisera and mouse monoclonal antibodies to abl 
proteins were prepared using bacterially expressed regions of 
the v-abl protein as immunogens (17, 19). Anti-pEX-2 
antibodies react with the internal tyrosine kinase domain and 
anti-pEX-5 antibodies react with the carboxyl-terminal segment 
of the abl proteins. 

RNA Analysis. RNA was extracted from 10^8 cells by the 
NaDodSO4/urea/phenol method (20). Polyadenylylated 
RNA was purified by oligo(dT) affinity chromatography. 
Samples were electrophoresed in a 1% agarose/formaldehyde 
gel and transferred to nitrocellulose. abl RNA species 
were detected by hybridization with a nick-translated v-abl 
fragment probe (21). 

DNA Analysis. DNA was prepared from 5 x 10^7 cells of 
each cell line and processed for Southern blots with a v-abl 
probe as described (21). 

RESULTS 

Variable Levels of P210c-abl Are Detected in Ph1-Positive Cell 
Lines. Ph1-positive and Ph1-negative, Epstein-Barr 
virus-transformed B-lymphocyte cell lines derived from the same 
patient were examined for P210c-abl synthesis by 
immunoprecipitation of [32P]orthophosphate-labeled cell extracts 
with anti-abl sera (Fig. 1). The normal c-abl protein P145c-abl 
was detected at a similar level in multiple Ph1-positive and 
Ph1-negative cell lines. P210c-abl was only detected in the 
Ph1-positive cell lines because the bcr-abl chimeric gene 
which encodes P210c-abl resides on the Ph1 (4, 5, 11, 13). The 
level of P210c-abl was about 4- to 5-fold higher than the level 
of P145c-abl in the SK-CML7Bt-33 cell line (Fig. 1A, +). The 
Ph1-positive erythroid-progenitor cell line K562 (C) showed 
a level of P210c-abl about 10-fold higher than P145c-abl. 
However, the level of P210c-abl was about one-fifth that of 
P145c-abl in the Ph1-positive SK-CML16Bt-1 cell line (Fig. 1B, 
+). Comparison of different autoradiographic exposures 
roughly indicated that the level of P210c-abl varies over a 
20-fold range between these Ph1-positive B-cell lines. Analysis 
of four additional Ph1-positive B-cell lines demonstrated 
that the level of P210c-abl fell into two general classes; some 
cell lines had a level of P210c-abl similar to SK-CML7Bt-33 
and others had the low level similar to SK-CML16Bt-1 (Table 
1). This differs from previous studies with Ph1-positive 
myeloid cell lines and patient samples derived from 
acute-phase CML patients, in which P210c-abl was detected at a 
10-fold higher level than P145c-abl (refs. 10 and 11; Table 1). 
There was no large difference in level of chimeric mRNA and 
P210c-abl expressed in four myeloid/erythroid-lineage 
Ph1-positive cell lines (K562, EM2, EM3, Q£L22, and BV173; 
refs. 9 and 11), despite a 4- to 5-fold amplification of 
abl-related sequences in the K562 cell line. 

Fig. 1. Detection of variable levels of P210c-abl in Ph1-positive 
B-cell lines. Production of P145c-abl and P210c-abl in Epstein-Barr 
virus-transformed B-cell lines derived from a blast-crisis (A) and a 
chronic-phase (B) CML patient was examined by metabolic labeling 
with [32P]orthophosphate and immunoprecipitation. Ph1-negative 
(-) and Ph1-positive (+) cell lines derived from each patient were 
analyzed. The Ph1-negative cell line in A,- is SK-CML7BN-2 and in 
B,- is SK-CML16BN-1. The Ph1-positive cell line in A,+ is 
SK-CML7Bt-33 and in B,+ is SK-CML16Bt-1. The K562 cell line, a 
Ph1-positive erythroid progenitor cell line spontaneously derived 
from a blast-crisis patient (33), is represented in C. Cells (1.5 x 10^7) 
were metabolically labeled with 2 mCi of [32P]orthophosphate for 3-4 
hr and then were extracted and clarified by centrifugation. Samples 
were immunoprecipitated with control normal serum (lanes 1), 
anti-pEX-2 (lanes 2), or anti-pEX-5 (lanes 3) and analyzed by 
NaDodSO4/8% PAGE followed by autoradiography with an 
intensifying screen (3 days for A and C, 10 days for B). 
Detection of different levels of P210c-abl in Fig. 1 could be 
due to decreased phosphorylation of P210c-abl, a lower level 
of P210c-abl synthesis, or altered stability of the protein. To 
help distinguish among these possibilities, the steady-state 
level of P210c-abl in the cell lines was assayed by 
immunoblotting. The results show that SK-CML7Bt-33 (Fig. 2A, +) 
had a higher level of P210c-abl than P145, similar to the results 
with metabolic labeling (Fig. 1). We did not detect P210c-abl 
by immunoblotting with 2 x 10^7 cells of line SK-CML8Bt-3 
(Fig. 2B, +). Reconstruction experiments using dilutions of 
cell extracts showed that we could detect about 5-10% the 
level of P210c-abl expressed in the K562 cell line (data not 
shown). We infer that the steady-state level of P210c-abl in 
SK-CML8Bt-3 is lower than the level in SK-CML7Bt-33 by 
a factor of at least 10. The level of P210c-abl detected in these 
assays correlated with the amount of P210c-abl tyrosine kinase 
activity that could be detected in vitro (data not shown). 

Different Levels of P210c-abl Are Reflected in the Amount of 
Stable bcr-abl mRNA. To identify the basis for detection of 
variable levels of P210c-abl, we examined the production of 
the abl RNA. RNA blot hybridization analysis using a v-abl 
probe (Fig. 3) showed that the normal 6- and 7-kb c-abl 
mRNAs were present at a similar level in Ph1-positive and 
-negative cell lines derived from different patients. However, 
the 8-kb mRNA that encodes P210c-abl was detected at a 
10-fold higher level in SK-CML7Bt-33 (Fig. 3A, +) than in 
SK-CML16Bt-1 (B, +), which correlated with the relative 
level of P210c-abl detected in each cell line. Analysis of 
additional cell lines demonstrated that the level of 8-kb RNA 
directly correlated with the level of P210c-abl (Table 1). The 
variation in level of 8-kb RNA detected in these cell lines was 
not due to loss or gain of Ph1, because cytogenetic analysis 
confirmed the presence of Ph1 in these cell lines (ref. 16 and 
Table 1. Relative levels of bcr-abl expression in Epstein-Barr 
virus-immortalized B-cell lines and myeloid CML lines 

SK-CML16Bt-1   Chronic   + 
SK-CML35Bt-2   Chronic 
K562   BC   +++++   +++++ 
BV173   BC   +   +++++   +++++ 
EM2   BC   +   +++++ 

*Cell lines derived from CML patients by transformation with 
Epstein-Barr virus as described (16). Names of cell lines indicate 
patient number and Ph1 status: SK-CML7Bt indicates a cell line 
derived from patient 7 that carries the 9;22 Ph1 translocation; N 
indicates a normal karyotype. Myeloid-erythroid cell lines (K562, 
EM2, and BV173) are described in previous publications (9, 11, 22, 
33). 

†Status of patient at the time cell line was derived. BC, blast crisis; 
Acc, accelerated phase. 

‡Presence (+) or absence (-) of Ph1 as demonstrated by karyotypic 
or Southern blot analysis. 

§P210c-abl detected as described in legend to Fig. 1. B-cell lines 
derived from blast-crisis and accelerated-phase patients had levels 
of P210 3- to 5-fold higher (+++) than levels of P145. 
Chronic-phase-derived cell lines had P210 levels lower than or just 
equivalent (+) to the level of P145. Myeloid and erythroid lines had 
levels of P210 5- to 10-fold higher than P145 (+++++). 

¶Eight-kilobase bcr-abl mRNA detected as described in legend to 
Fig. 2. Symbols: ±, borderline detectable; +++++, level of 8-kb 
mRNA 5- to 10-fold higher than that of the 6- and 7-kb c-abl mRNA 
species; +++, level of 8-kb mRNA 3- to 5-fold higher than that of 
the 6- and 7-kb species; +, a level approximately equivalent to that 
of the 6- and 7-kb messages. 

data not shown). There was no difference in the copy number 
of abl-related sequences as judged by Southern blot analysis 
(Fig. 4). Only the K562 cell line control showed an 
amplification of abl sequences, as previously reported (22, 23). 
These combined data suggest that differential bcr-abl mRNA 
expression from a single gene template is responsible for the 
variable levels of P210c-abl detected. This could be mediated 
Fig. 2. Analysis of steady-state abl protein levels by 
immunoblotting. Cell extracts prepared from 2 x 10^7 cells of lines 
SK-CML7BN-2 (A,-), SK-CML7Bt-33 (A,+), SK-CML8BN-10 (B,-), 
and SK-CML8Bt-3 (B,+) were concentrated by 
immunoprecipitation with anti-pEX-2 plus anti-pEX-5. Samples were then 
electrophoresed in a NaDodSO4/8% polyacrylamide gel and transferred to 
nitrocellulose, using protease-facilitated transfer (18). abl proteins 
were detected using a mixture of two monoclonal antibodies directed 
against the pEX-2 and pEX-5 abl-protein fragments produced in 
bacteria (19) as a probe and a peroxidase-conjugated goat anti-mouse 
second-stage antibody (Bio-Rad) for development. 





Fig. 3. Comparison of abl RNA levels in Ph1-positive and 
-negative B-cell lines. The levels of the normal 6- and 7-kb c-abl 
RNAs and the 8-kb bcr-abl RNA were analyzed by blot hybridization 
using a v-abl probe. RNA was extracted from Ph1-negative lines 
SK-CML7BN-2 (A,-) and SK-CML16BN-1 (B,-), from 
Ph1-positive lines SK-CML7Bt-33 (A,+) and SK-CML16Bt-3 (B,+), and 
from line K562 (C,+) by the NaDodSO4/urea/phenol method (20). 
Polyadenylylated RNA was purified by oligo(dT) affinity 
chromatography, and 15 μg of each sample was electrophoresed in a 1% 
agarose/formaldehyde gel and then transferred to nitrocellulose. The 
blotted RNAs were hybridized with a nick-translated v-abl fragment 
probe (21) and then autoradiographed for 4 days. 



by factors influencing the transcription rate of the bcr-abl 
gene or the stability of the mRNA. 


DISCUSSION 

Several lines of evidence suggest that formation of Ph 1 is not 
the primary event that affects the stem cell in CML. Patients 
have been identified that present with the clinical picture of 
CML but only later develop Ph 1 (1). This observation, 
coupled with studies of G6PD (glucose-6-phosphate 
dehydrogenase)-heterozygous females with CML that demonstrate 
stem-cell clonality by isozyme analysis among cell 
Fig. 4. Southern blot analysis of abl sequences in Ph1-positive 
and -negative B-cell lines. High molecular weight DNA (15 μg) was 
digested with restriction endonuclease BamHI, separated in a 0.8% 
agarose gel, and then transferred to nitrocellulose. The blotted DNA 
fragments were hybridized with a nick-translated, 2.4-kb Bgl II v-abl 
fragment (1.5 x 10^8 cpm/μg; ref. 21) and exposed for 4 days. (A) 
Autoradiogram of abl-specific fragments in cell lines HL-60 (lane 1), 
EM2 (lane 2), K562 (lane 3), SK-CML7Bt-33 (lane 4), SK-CML8Bt-3 
(lane 5), SK-CML16Bt-1 (lane 6), SK-CML21Bt-6 (lane 7), 
SK-CML35Bt-2 (lane 8), SK-CML7BN-2 (lane 9), SK-CML8BN-2 (lane 
10), and SK-CML35BN-1 (lane 11). (B) Ethidium bromide staining of 
agarose gel prior to transfer to nitrocellulose, showing the level of 
variation in amount of DNA loaded per lane. 
populations that lack the Ph1 marker, supports a secondary 
or complementary role for Ph1 in the progression of the 
disease (24, 25). This chromosome marker is found in 
chronic, accelerated, and blast-crisis phases of the disease. It 
is likely that Ph1 confers some growth advantage, since cells 
with the marker chromosome eventually predominate the 
marrow and peripheral blood even in chronic phase. During 
the phase of blast crisis, many patients develop additional 
chromosome abnormalities, including duplication of Ph1, a 
variety of trisomies, and complex translocations (26). This 
is suggestive evidence for Ph1 being a necessary but not 
sufficient genetic change for the full evolution of the 
disease. 

The realization that one molecular result of Ph1 is the 
generation of a chimeric bcr-abl protein with functional 
characteristics and structure analogous to the gag-abl 
transforming protein of the Abelson murine leukemia virus 
strengthens the argument for an important role of Ph1 in the 
pathogenesis of CML. Although the Abelson virus is generally 
considered a rapidly transforming retrovirus, its effects 
can range from overcoming growth factor requirements, to 
cellular lethality, to induction of highly oncogenic tumors in 
a number of hematopoietic cell lineages (27, 28). Even in the 
transformation of murine cell targets, there are several lines 
of evidence that suggest that the growth-promoting activity of 
the v-abl gene product is complemented by further cellular 
changes in the production of the malignant-cell phenotype 
(29-31). 

The regulation of bcr-abl gene expression is complex 
because the 5' end of the gene is derived from the non-abl 
sequences, bcr, normally found on chromosome 22 (6). The 
level of stable message for the normal bcr gene and the 
normal abl gene are both much lower than the level of the 
bcr-abl message and protein from cell lines and clinical 
specimens derived from myeloid blast-crisis patients (5, 6, 
11). Therefore, the high level of bcr-abl expression cannot 
simply be attributed to the regulatory sequences associated 
with bcr. Possibly, creation of the chimeric gene disrupts the 
normal regulatory sequences and results in a higher level of 
expression. Variation in bcr-abl expression may result from 
secondary changes in the structure of the chimeric gene or 
function of trans-acting factors that occur during evolution of 
the disease. Our analysis of P210c-abl and the 8-kb mRNA in 
Epstein-Barr virus-transformed Ph1-positive B-cell lines 
demonstrates that stable message and protein levels from the 
bcr-abl gene can vary over a wide range. This variation does 
not result from a change in the number of bcr-abl templates 
secondary to gene amplification but more likely from changes 
in either transcription rate or mRNA stability. We suspect 
this range of bcr-abl expression is not limited to lymphoid 
cells. Analysis of peripheral blood leukocytes derived from 
an unusual CML patient who has been in chronic phase with 
myeloid predominance for 16 years showed a level of 
P210c-abl one-fifth that of P145c-abl, as detected by metabolic 
labeling with [32P]orthophosphate and immunoprecipitation 
(S.C., O.N.W., and P. Greenberg, unpublished observations). 
Lower levels of expression of the chimeric mRNA 
have been demonstrated in clinical samples from 
chronic-phase CML patients compared to acute-phase CML patients 
(9). Others have reported chronic-phase patients with variable 
but, in some cases, relatively high levels of the bcr-abl 
mRNA (32). The sampling variation and the heterogeneous 
mixture of cell types in clinical samples complicate such 
analyses. Further work is needed to evaluate whether there 
is a defined change in P210c-abl expression during the 
progression of CML. It is interesting to note that among the 
limited sample of Ph1-positive B-cell lines we have examined 
(Table 1), we have seen higher levels of P210c-abl in those 
derived from patients at more advanced stages of the disease. 
It will be important to search for cell-type-specific 
mechanisms that might regulate expression of bcr-abl from Ph1. 
1 Introduction 

A proteome has been defined as the protein complement 
expressed by the genome of an organism, or, in multicellular 
organisms, as the protein complement expressed by a 
tissue or differentiated cell [1]. In the most common 
implementation of proteome analysis the proteins extracted 
from the cell or tissue analyzed are separated by high 

Correspondence: Professor Ruedi Aebersold, Department of 
Molecular Biotechnology, University of Washington, Seattle, WA, 
USA 

Abbreviations: CID, collision-induced dissociation; MS/MS, tandem 
mass spectrometry; SAGE, serial analysis of gene expression 

Proteome / Tandem mass spectrometry 



2 Rationale for proteome analysis 

The dramatic growth in both the number of genome 
projects and the speed with which genome sequences 
are being determined has generated huge amounts of 
sequence information, for some species even complete 
genomic sequences [2-17]. The description of the 
state of a biological system by the quantitative 
measurement of system components has long been a primary 
objective in molecular biology. With recent technical 
advances including the development of differential 
display-PCR [18], cDNA microarray and DNA chip 
technology [19, 20] and serial analysis of gene expression 
(SAGE) [21, 22], it is now feasible to establish global and 
quantitative mRNA expression maps of cells and tissues 
in which the sequence of all the genes is known, at a 
speed and sensitivity which is not matched by current 
protein analysis technology. Given the long-standing 
paradigm in biology that DNA synthesizes RNA which 
synthesizes protein, and the ability to rapidly establish 
global, quantitative mRNA expression maps, the questions 
which arise are why technically complex proteome 
projects should be undertaken and what specific types of 
information could be expected from proteome projects 
which cannot be obtained from genomic and transcript 
profiling projects. We see three main reasons for 
proteome analysis to become an essential component in the 
comprehensive analysis of biological systems. (i) Protein 
expression levels are not predictable from the mRNA 
expression levels, (ii) proteins are dynamically modified 
and processed in ways which are not necessarily 
apparent from the gene sequence, and (iii) proteomes 
are dynamic and reflect the state of a biological system. 

2.1 Correlation between mRNA and protein expression levels

Interpretations of quantitative mRNA expression profiles frequently implicitly or explicitly assume that for specific genes the transcript levels are indicative of the levels of protein expression. As part of an ongoing study in our laboratory, we have determined the correlation of expression at the mRNA and protein levels for a population of selected genes in the yeast Saccharomyces cerevisiae growing at mid-log phase (S. P. Gygi et al., submitted for publication). mRNA expression levels were calculated from published SAGE frequency tables [22]. Protein expression levels were quantified by metabolic radiolabeling of the yeast proteins, liquid scintillation counting of the protein spots separated by high resolution 2-DE and mass spectrometric identification of the protein(s) migrating to each spot. The selected 80 samples constitute a relatively homogeneous group with respect to predicted half-life and expression level of the protein products. Thus far, we have found a general trend but no strong correlation between protein and transcript levels (Fig. 1). For some genes studied, equivalent mRNA transcript levels translated into protein abundances which varied by more than 50-fold. Similarly, equivalent steady-state protein expression levels were maintained by transcript levels varying by as much as 40-fold (S. P. Gygi et al., submitted). These results suggest that even for a population of genes predicted to be relatively homogeneous with respect to protein half-life and gene expression, the protein levels cannot be accurately predicted from the level of the corresponding mRNA transcript.
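
As a rough illustration of the kind of comparison described above, the following sketch computes a Pearson correlation between transcript and protein levels and the fold-range of protein abundance among genes with near-equal transcript levels. The numbers are invented placeholders and are not data from the study; the helper names `pearson` and `fold_range` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fold_range(values):
    """Ratio of the largest to the smallest value (all values must be > 0)."""
    return max(values) / min(values)


# Invented example values (NOT data from the study): transcript level and
# protein abundance, both in arbitrary units, for six hypothetical genes.
mrna = [1.0, 1.1, 0.9, 5.0, 10.0, 1.0]
protein = [2.0, 40.0, 100.0, 60.0, 300.0, 5.0]

r = pearson(mrna, protein)

# Protein abundances of genes whose transcript levels are nearly equal.
similar = [p for m, p in zip(mrna, protein) if 0.9 <= m <= 1.1]

print(f"r = {r:.2f}")
print(f"protein fold-range at similar mRNA level = {fold_range(similar):.0f}")
```

A weak correlation combined with a large fold-range among genes of similar transcript level is the pattern the text reports: a general trend, but poor predictive value for any individual gene.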
2.2 Proteins are dynamically modified and processed

In the mature, biologically active form many proteins are post-translationally modified by glycosylation, phosphorylation, prenylation, acylation, ubiquitination or one or more of many other modifications [23], and many proteins are only functional if specifically associated or complexed with other molecules, including DNA, RNA, proteins and organic and inorganic cofactors. Frequently, modifications are dynamic and reversible and may alter the precise three-dimensional structure and the state of activity of a protein. Collectively, the state of modification of the proteins which constitute a biological system is an important indicator for the state of the system. The type of protein modification and the sites modified at a specific cellular state can usually not be determined from the gene sequence alone.

2.3 Proteomes are dynamic and reflect the state of a biological system

A single genome can give rise to many qualitatively and quantitatively different proteomes. Specific stages of the cell cycle and states of differentiation, responses to growth and nutrient conditions, temperature and stress, and pathological conditions represent cellular states which are characterized by significantly different proteomes. The proteome, in principle, also reflects events that are under translational and post-translational control. It is therefore expected that proteomics will be able to provide the most precise and detailed molecular description of the state of a cell or tissue, provided that the external conditions defining the state are carefully determined. In answer to the question of whether the study of proteomes is necessary for the analysis of biomolecular systems, it is evident that the analysis of mature protein products in cells is essential as there are numerous levels of control of protein synthesis, degradation, processing and modification, which are only apparent by direct protein analysis.



3 Description and assessment of current proteome analysis technology

3.1 Technical requirements of proteome technology

In biological systems the level of expression as well as the states of modification, processing and macromolecular association of proteins are controlled and modulated depending on the state of the system. Comprehensive analysis of the identity, quantity and state of modification of proteins therefore requires the detection and quantitation of the proteins which constitute the system, and analysis of differentially processed forms. There are a number of inherent difficulties in protein analysis which complicate these tasks.

First, proteins cannot be amplified. It is possible to produce large amounts of a particular protein by over-expression in specific cell systems. However, since many proteins are dynamically post-translationally modified, they cannot be easily amplified in the form in which they finally function in the biological system. It is frequently difficult to purify from the native source sufficient amounts of a protein for analysis. From a technological point of view this translates into the need for high sensitivity analytical techniques.

Second, many proteins are modified and processed post-translationally. Thus, in addition to the protein identity, the structural basis for differentially modified isoforms also needs to be determined. The distribution of a constant amount of protein over several differentially modified isoforms further reduces the amount of each species available for analysis. The complexity and diversity of post-translational protein editing thus significantly complicates proteome studies.

Third, proteins vary dramatically with respect to their solubility in commonly used solvents. There are few, if any, solvent conditions in which all proteins are soluble and which are also compatible with protein analysis. This makes the development of protein purification methods particularly difficult since both protein purification and solubility have to be achieved under the same conditions. Detergents, in particular sodium dodecyl sulfate (SDS), are frequently added to aqueous solvents to maintain protein solubility. The compatibility with SDS is a big advantage of SDS polyacrylamide gel electrophoresis (SDS-PAGE) over other protein separation techniques. Thus, SDS-PAGE and two-dimensional gel electrophoresis, which also uses SDS and other detergents, are the most general and preferred methods for the purification of small amounts of proteins, provided that activity does not necessarily need to be maintained.

Lastly, the number of proteins in a given cell system is typically in the thousands. Any attempt to identify and categorize all of these must use methods which are as rapid as possible to allow completion of the project within a reasonable time frame. Therefore, a successful, general proteomics technology requires high sensitivity, high throughput, the ability to differentiate differentially modified proteins and the ability to quantitatively display and analyze all the proteins present in a sample.

3.2 2-D electrophoresis - mass spectrometry: a common implementation of proteome analysis

The most common currently used implementation of proteome analysis technology is based on the separation of proteins by two-dimensional (IEF/SDS-PAGE) gel electrophoresis and their subsequent identification and analysis by mass spectrometry (MS) or tandem mass spectrometry (MS/MS). In 2-DE, proteins are first separated by isoelectric focusing (IEF) and then by SDS-PAGE, in the second, perpendicular dimension. Separated proteins are visualized at high sensitivity by staining or autoradiography, producing two-dimensional arrays of proteins. 2-DE gels are, at present, the most commonly used means of global display of proteins in complex samples. The separation of thousands of proteins has been achieved in a single gel [24, 25] and differentially modified proteins are frequently separated. Due to the compatibility of 2-DE with high concentrations of detergents, protein denaturants and other additives promoting protein solubility, the technique is widely used.

The second step of this type of proteome analysis is the identification and analysis of separated proteins. Individual proteins from polyacrylamide gels have traditionally been identified using N-terminal sequencing [?], internal peptide sequencing [29, 30], immunoblotting or comigration with known proteins [?]. The recent dramatic growth of large-scale genomic and expressed sequence tag (EST) sequence databases has resulted in a fundamental change in the way proteins are identified by their amino acid sequence. Rather than by the traditional methods described above, protein sequences are now frequently determined by correlating mass spectral or tandem mass spectral data of peptides derived from proteins with the information contained in sequence databases [31-33].

There are a number of alternative approaches to proteome analysis currently under development. There is considerable interest in developing a proteome analysis strategy which bypasses 2-DE altogether, because it is considered a relatively slow and tedious process, and because of perceived difficulties in extracting proteins from the gel matrix for analysis. However, 2-DE as a starting point for proteome analysis has many advantages compared to other techniques available today. The most significant strengths of the approach include the relatively uniform behavior of proteins in gels, the ability to quantify spots and the high resolution and simultaneous display of hundreds to thousands of proteins within a reasonable time frame.

A schematic diagram of a typical procedure for the identification of gel-separated proteins is shown in Fig. 2. Protein spots detected in the gel are enzymatically or chemically fragmented and the peptide fragments are isolated for analysis, as already indicated, most frequently by MS or MS/MS. There are numerous protocols for the generation of peptide fragments from gel-separated proteins. They can be grouped into two categories, digestion in the gel slice [?, 34] or digestion after electrotransfer out of the gel onto a suitable membrane ([29, 35-37] and reviewed in [38]). In most instances either technique is applicable and yields good results. The analysis of MS or MS/MS data is an important step in the whole process because MS instruments can generate an enormous amount of information which cannot easily be managed manually. Recently, a number of groups have developed software systems dedicated to the use of peptide MS and MS/MS spectra for the identification of proteins. Proteins are identified by correlating the information contained in the MS spectra of protein digests or MS/MS spectra of individual peptides with data contained in DNA or protein sequence databases.

The systems we are currently using in our laboratory are based on the separation of the peptides contained in protein digests by narrow bore or capillary liquid chromatography [39, 40] or capillary electrophoresis [41], the analysis of the separated peptides by electrospray ionization (ESI) MS/MS, and the correlation of the generated peptide spectra with sequence databases using the SEQUEST program developed at the University of Washington [31, 32]. The system automatically performs the following operations: a particular peptide ion characterized by its mass-to-charge ratio is selected in the MS out of all the peptide ions present in the system at a particular time; the selected peptide ion is fragmented in a collision cell with argon (collision-induced dissociation, CID), and the masses of the resulting fragment ions are determined in the second sector of the tandem MS; this experimentally determined CID spectrum is then correlated with the CID spectra predicted from all the peptides in a sequence database which have essentially the same mass as the peptide selected for CID; this correlation matches the isolated peptide with a sequence segment in a database and thus identifies the protein from which the peptide was derived. There are a number of alternative programs which use peptide CID spectra for protein identification, but we use the SEQUEST system because it is currently the most highly automated program and has proven to be successful, versatile and robust.
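
The sequence of operations described above (select candidate peptides of essentially the same mass, predict their fragment spectra, and pick the best match) can be sketched in a few lines. This is a deliberately simplified toy and not the SEQUEST algorithm: it swaps in a plain shared-peak count for SEQUEST's correlation scoring, considers only singly charged b- and y-ions, and searches a tiny invented database. All function names, protein names and peptides are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def predicted_fragments(seq):
    """Singly charged b- and y-ion m/z values predicted for a peptide."""
    fragments = []
    b = PROTON
    for aa in seq[:-1]:              # b1 .. b(n-1), from the N-terminus
        b += RESIDUE_MASS[aa]
        fragments.append(b)
    y = WATER + PROTON
    for aa in reversed(seq[1:]):     # y1 .. y(n-1), from the C-terminus
        y += RESIDUE_MASS[aa]
        fragments.append(y)
    return fragments


def shared_peak_score(observed, predicted, tol=0.5):
    """Number of predicted fragments matched by an observed peak."""
    return sum(any(abs(o - p) <= tol for o in observed) for p in predicted)


def identify(precursor_mass, observed, database, mass_tol=1.0):
    """Return (protein, peptide, score) of the best-scoring candidate."""
    best = None
    for protein, peptides in database.items():
        for pep in peptides:
            # Only peptides of essentially the same mass are candidates.
            if abs(peptide_mass(pep) - precursor_mass) > mass_tol:
                continue
            score = shared_peak_score(observed, predicted_fragments(pep))
            if best is None or score > best[2]:
                best = (protein, pep, score)
    return best


# Invented toy database: protein name -> peptides from an in silico digest.
database = {
    "protein_A": ["SAMPLER", "GAVLK"],
    "protein_B": ["SAMPELR", "TIDEK"],
}

# Simulate a noise-free CID spectrum of the peptide SAMPLER.
observed = predicted_fragments("SAMPLER")
print(identify(peptide_mass("SAMPLER"), observed, database))
```

The two candidates SAMPLER and SAMPELR have the same mass, so the precursor mass alone cannot separate them; only the fragment ions distinguish the two sequences, which is why the CID spectrum is needed for identification.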



3.3 Protein identification by LC-MS/MS, capillary LC-MS/MS and CE-MS/MS

It has been demonstrated repeatedly that MS has a very high intrinsic sensitivity. For the routine analysis of gel-separated proteins at high sensitivity, the most significant challenge is the handling of small amounts of sample. The crux of the problem is the extraction and transferal of peptide mixtures generated by the digestion of low nanogram amounts of protein from gels into the MS/MS system without significant loss of sample or introduction of unwanted contaminants. We employ three different systems for introducing gel-purified samples into an MS, depending on the level of sensitivity required. As an approximate guideline, for samples containing tens of picomoles of peptides, LC-MS/MS is most appropriate; for samples containing low picomole amounts to high femtomole amounts we use capillary LC-MS/MS; and for samples containing femtomoles or less, CE-MS/MS is the method of choice.

3.3.1 LC-MS/MS

The coupling of an MS to an HPLC system using a 0.5 mm diameter or bigger reverse phase (RP) column has been described in detail [42]. This system has several advantages if a large number of samples are to be analyzed and all are available in sufficient quantity. The LC-MS and database searching program can be run in a fully automated mode using an autosampler, thus maximizing sample throughput and minimizing the need for operator interference. The relatively large column is tolerant of high levels of impurities from either gel preparation or sample matrix. Lastly, if configured with a flow-splitter and micro-sprayer [40], analyses can be performed on a small fraction of the sample (less than 5%) while the remainder of the sample is recovered in very pure solvents. This latter feature is particularly useful when an orthogonal technique is also used to analyze peptide fractions, such as scintillation counting of an introduced radiolabel, and this data can be correlated with peptides identified by CID spectra.

3.3.2 Capillary LC-MS/MS

An increase of sensitivity of approximately tenfold can be achieved by using a capillary LC system with a 100 µm ID column rather than a 0.5 mm ID column as referred to above. Since very low flow rates are required for such columns, most reports have used a precolumn flow splitting system for producing solvent gradients. We have recently described the design and construction of a novel gradient mixing system which enables the formation of reproducible gradients at very low flow rates (low nL/min) without the need for flow splitting (A. Ducret et al., submitted for publication). Using this capillary LC-MS/MS system we were able to identify gel-separated proteins if low picomole to high femtomole amounts were loaded onto the gel [40]. This system is as yet not automated and, like all capillary LC systems, is prone to blockage of the columns by microparticulates when analyzing gel-separated proteins.


3.3.3 CE-MS/MS

The highest level of sensitivity for analyzing gel-separated proteins can be achieved by using capillary electrophoresis - mass spectrometry (CE-MS). We have described in the past a solid-phase extraction capillary electrophoresis (SPE-CE) system which was used with triple quadrupole and ion trap ESI-MS/MS systems for the identification of proteins at the low femtomole to subfemtomole sensitivity level [43, 44]. While this system is highly sensitive, its operation is labor-intensive and has not been automated. In order to devise an analytical system with both the sensitivity of CE and the level of automation of LC, we have constructed microfabricated devices for the introduction of samples into ESI-MS for high-sensitivity peptide analysis.

The basic device is a piece of glass into which channels of 10-30 µm in depth and 50-70 µm in diameter are etched by using photolithography/etching techniques similar to the ones used in the semiconductor industry. (A device is shown in Fig. 3.) The channels are connected to an external high voltage power supply [?]. Samples are manipulated on the device and off the device to the MS by applying different potentials to the reservoirs. This creates a solvent flow by electroosmotic pumping which can be redirected by changing the position of the electrode. Therefore, without the need for valves or gates and without any external pumping, the flow can be redirected by simply switching the position of the electrodes on the device. The direction and rate of the flow can be modulated by the size and directionality of the electric field applied and also by the charge state of the surface.

The type of data generated by the system is illustrated in Fig. 4, which shows the mass spectrum of a peptide sample representing the tryptic digest of carbonic anhydrase at 290 fmol/µL. Each numbered peak indicates a peptide successfully identified as being derived from carbonic anhydrase. Some of the unassigned signals may be chemical or peptide contaminants. The MS is programmed to automatically select each peak and subject the peptide to CID. The resulting CID spectra are then used to identify the protein by correlation with sequence databases. Therefore, this system allows us to concurrently apply a number of protein digests onto the device, to sequentially mobilize the samples, to automatically generate CID spectra of selected peptide ions and to search sequence databases for protein identification. These steps are performed automatically without the need for user input, and proteins can be identified at very low femtomole level sensitivity at a rate of approximately one protein per 15 min.

Figure 4. MS of a peptide sample using the microfabricated system shown in Fig. 3. 290 fmol/µL of carbonic anhydrase tryptic digest was infused into a Finnigan LCQ ion trap MS. Each peak was selected for CID, and those which were identified as containing peptides derived from carbonic anhydrase are numbered. Reproduced from [?], with permission.

3.4 Assessment of 2-DE-MS proteome technology

Using a combination of the analytical techniques described above we have identified the 80 protein spots indicated in Fig. 5. The protein pattern was generated by separating a total of 40 micrograms of protein contained in a total cell lysate of the yeast strain YPH499 by high resolution 2-DE and silver staining of the separated proteins. To estimate how far this type of proteome analysis can penetrate towards the identification of low abundance proteins, we have calculated the codon bias of the genes encoding the respective proteins. Codon bias is a calculated measure of the degree of redundancy of triplet DNA codons used to produce each amino acid in a particular gene sequence. It has been shown to be a useful indicator of the level of the protein product of a particular gene sequence present in a cell. The general rule which applies is that the higher the value of the codon bias calculated for a gene, the more abundant the protein product of that gene becomes. The calculated codon bias values corresponding to the proteins identified in Fig. 5 are shown in Fig. 6b. Nearly all of the proteins identified (> 95%) have codon bias values of > 0.2, indicating they are highly abundant in cells. In contrast, codon bias values calculated for the entire yeast genome (Fig. 6a) show that the majority of proteins present in the proteome have a codon bias of < 0.2 and are thus of low abundance.

This finding is of considerable importance in our assessment of the current status of proteome analysis technology. It is clear that even using highly sensitive analytical techniques, we are only able to visualize and identify the more abundant proteins. Since many important regulatory proteins are present only at low abundance, these would not be amenable to analysis using such techniques. This situation would be exacerbated in the analysis of proteomes containing many more proteins than the approximately 6000 gene products present in yeast cells [16]. In the analysis of, for example, the proteome of any human cell, there are potentially 50 000-100 000 gene products [47]. Inherent limitations on the amount of protein that can be loaded on 2-DE, and the number of components that can be resolved, indicate that only the most highly abundant fraction of the many gene products could be successfully analyzed. One approach that has been employed to circumvent these limitations is the use of very narrow range immobilized pH gradient strips for the first-dimension separation of 2-DE [48]. Since only those proteins which focus within the narrow range will enter the second dimension of separation, a much higher sample loading within the desired range is possible. This, in turn, can lead to the visualization and identification of less abundant proteins.
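
The codon-bias screen used in this assessment can be illustrated with a small sketch. The measure below is a simplified stand-in, assumed here for illustration only: it is the plain fraction of codons that fall in a set of "preferred" codons, which is not necessarily the formula behind the values in Fig. 6. The preferred-codon set, the gene names and the sequences are all invented; only the 0.2 cut-off is taken from the text.

```python
def codon_bias(cds, preferred):
    """Fraction of codons in a coding sequence that are 'preferred' codons.

    Simplified illustrative measure; trailing bases that do not fill a
    complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    return sum(codon in preferred for codon in codons) / len(codons)


def is_likely_abundant(bias, threshold=0.2):
    """Apply the > 0.2 cut-off quoted in the text for abundant proteins."""
    return bias > threshold


# Hypothetical preferred-codon set (NOT a real yeast codon table).
PREFERRED = {"GCT", "GGT", "AAG", "TTG", "GAA"}

# Invented coding sequences for two hypothetical genes.
genes = {
    "gene_hi": "GCTGGTAAGTTGGAAGCT",
    "gene_lo": "GCAGGGAAATTAGAGGCT",
}

for name, cds in genes.items():
    bias = codon_bias(cds, PREFERRED)
    label = "likely abundant" if is_likely_abundant(bias) else "likely low abundance"
    print(f"{name}: codon bias = {bias:.2f} ({label})")
```

Applied to every gene of a genome, a screen of this kind gives the distribution against which the identified proteins are compared: if nearly all identified proteins sit above the cut-off while most genes sit below it, the method is reaching only the abundant fraction.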
4 Utility of proteome analysis for biological research

For the success of proteomics as a mainstream approach to the analysis of biological systems it is essential to define how proteome analysis and biological research projects intersect. Without a clear plan for the implementation of proteome-type approaches into biological research projects the full impact of the technology cannot be realized. The literature indicates that proteome analysis is used both as a database/data archive, and as a biological assay or biological research tool.



4.1 The proteome as a database

The use of proteomics as a database or data archive essentially entails an attempt to identify all the proteins in a cell or species and to annotate each protein with the known biological information that is relevant for each protein. The level of annotation can, of course, be extensive. The most common implementation of this idea is the separation of proteins by high resolution 2-DE, the identification of each detected protein spot and the annotation of the protein spots in a 2-DE gel database format. This approach is complicated by the fact that it is not clear which proteome should be represented in the database. In contrast to the genome of a species, which is essentially static, the proteome is highly dynamic. Processes such as differentiation, cell activation and disease can all significantly change the proteome of a species. This is illustrated in Fig. 7. The figure shows two high-resolution 2-DE maps of proteins isolated from rat serum. Fig. 7A is from the serum of normal rats, while Fig. 7B is from the acute-phase serum of rats after prior treatment with an inflammation-causing agent [49]. It is obvious that the protein patterns are significantly different in several areas, raising the question of exactly which proteome is being described.

Therefore, a comprehensive proteome database of a species or cell type needs to contain all of the parameters which describe the state and the type of the cells from which the proteins were extracted as well as the software tools to search the database with queries which reflect the dynamics of biological systems. A comprehensive proteome database should be capable of quantitatively describing the fate of each protein if specific systems and pathways are activated in the cell. Specifically, the quantity, the degree of modification, the subcellular location and the nature of molecules specifically interacting with a protein as well as the rate of change of these variables should be described. Using these admittedly stringent criteria, there is currently no complete proteome database. A number of such databases are, however, in the process of being constructed. The most advanced among these, in our opinion, are the yeast protein database YPD [50] (accessible at bdp^/inrwjrpdeMn) and the human 2D-PAGE databases of the Danish Centre for Human Genome Research [?] (accessible at bc^y/blobase^tai^la/oelb). While neither can be considered complete as not all of the potential gene products are identified, both contain extensive annotation of supplemental information for many of the spots which are positively identified in reference samples.


The use of proteome analysis as a biological assay or research tool represents an alternative approach to integrating biology with proteomics. To investigate the state of a system, samples are subjected to a specific process that allows the quantitative or qualitative measurement of some of the variables which describe the system. In typical biochemical assays one variable (e.g., enzyme activity) of a single component (e.g., a particular enzyme) is measured. Using proteomics as an assay, multiple variables (e.g., expression level, rate of synthesis, phosphorylation state, etc.) are measured concurrently on many (ideally all) of the proteins in a sample. The use of proteomics as an assay is a less far-reaching proposition than the construction of a comprehensive proteome database. It does, however, represent a pragmatic approach which can be adapted to investigate specific systems and pathways, as long as the interpretation of the results takes into account that with current technology not all of the variables which describe the system can be observed (see Section 3.4).

One way to use proteome analysis as a biological assay is when a 2-DE protein pattern generated from the analysis of an experimental sample is compared to an array of reference patterns representing different states of the system under investigation. The state of the experimental system at the time the sample was generated is therefore determined by the quantitative comparative analysis of hundreds to a few thousand proteins. Comparative analysis of the 2-DE patterns furthermore highlights quantitative and qualitative differences in the protein profiles which correlate with the state of the system. For this type of analysis it is not essential that all the proteins are identified or even visualized, although the results become more informative as more proteins are compared. It is obvious, however, that the possibility to identify any protein deemed characteristic for a particular state dramatically enhances this approach by opening up new avenues for experimentation.

Proteome analysis as a biological assay has been successfully used in the field of toxicology, to characterize disease states or to study differential activation of cells. The approach is limited, of course, by the fact that only the visible protein spots are included in the assay, and it is well known that a substantial but far from complete fraction of cellular proteins are detected if a total cell lysate is separated by 2-DE. Proteins may not be detected in 2-DE gels because they are not abundant enough to be visualized by the detection method used, because they do not migrate within the boundaries (size, pI) resolved by the gel, because they are not soluble under the conditions used, or for other reasons.

A different way to use proteome analysis as a biological assay to define the state of a biological system is to take advantage of the wealth of information contained in 2-DE protein patterns. 2-DE is referred to as two-dimensional because of the electrophoretic mobility and the isoelectric points which define the position of each protein in a 2-DE pattern. In addition to the two dimensions used to generate the protein patterns, a number of additional data dimensions are contained in the protein patterns. Some of these dimensions, such as protein expression level, phosphorylation state, subcellular location, association with other proteins, and rate of synthesis or degradation, indicate the activity state of a protein or a biological system. Comparative analysis of 2-DE protein patterns representing different states is therefore ideally suited for the detection, identification and analysis of suitable markers. Once again it must be emphasized that in this type of experiment only a fraction of the cellular proteins is analyzed. Since many regulatory proteins are of low abundance, this limitation is a concern, particularly in cases in which regulatory pathways are being investigated.

5 Concluding remarks

In this report we have addressed three main issues related to proteome analysis. First, we have discussed the rationale for studying proteomes. Second, we have assessed the technical feasibility of analyzing proteomes and described current proteome technology, and third, we have analyzed the utility of proteome analysis for biological research. It is apparent that proteome analysis is an essential tool in the analysis of biological systems. The multilevel control of protein synthesis and degradation in cells means that only the direct analysis of mature protein products can reveal their correct identities, their relevant state of modification and/or association and their amounts. Recently developed methods have enabled the identification of proteins at ever-

We have outlined the two principal ways proteome analysis is currently being used to intersect with biological research projects: the proteome as a database or data archive and proteome analysis as a biological assay. Both approaches have in common that at present they are conceptually and technically limited. Current proteome databases typically are limited to one cell type and one state of a cell and therefore do not account for the dynamics of biological systems. The use of proteome analysis as a biological assay can provide a wealth of information, but it is limited to the proteins detected and is therefore not truly proteome-wide. These limitations in proteomics are to a large extent a reflection of the fact that proteins in their fully processed form cannot easily be amplified and are therefore difficult to isolate in amounts sufficient for analysis or experimentation. The fact that to date no complete proteome has been described further attests to these difficulties. With continued rapid progress in protein analysis technology, however, we anticipate that the goal of complete proteome analysis will eventually become attainable.

We would like to acknowledge the funding of our work from the National Science Foundation Science and Technology Center for Molecular Biotechnology and from the NIH. We thank Yvan Rochon and Bob Franza for providing the yeast gel shown and Elisabetta Gianazza for providing the rat serum gels shown.

Received April 11, 1998

Electrophoresis 1998, 19, 1862-1871
Aneuploidy and cancer

Subrata Sen, PhD 



Numeric aberrations in chromosomes, referred to as aneu- 
ploidy, is commonly observed in human cancer, Whether aneu- 
ploidy is a cause or consequence of cancer has long been 
debated. Three lines of evidence now make a compelling case 
for aneuploidy being a discrete chromosome mutation event 
that contributes to malignant transformation and progression 
process. First, precise assays of chromosome aneuploidy in
several primary tumors with in situ hybridization and compara-
tive genomic hybridization techniques have revealed that
specific chromosome aneusomies correlate with distinct tumor
phenotypes. Second, aneuploid tumor cell lines and in vitro
transformed rodent cells have been reported to display an
elevated rate of chromosome instability, thereby indicating that
aneuploidy is a dynamic chromosome mutation event associ-
ated with transformation of cells. Third, and most important, a
number of mitotic genes regulating chromosome segregation
have been found mutated in human cancer cells, implicating
such mutations in induction of aneuploidy in tumors. Some of
these gene mutations, possibly allowing unequal segregations
of chromosomes, also cause tumorigenic transformation of
cells in vitro. In this review, the recent publications investigat-
ing aneuploidy in human cancers, rate of chromosome instabil-
ity in aneuploid tumor cells, and genes implicated in regulat-
ing chromosome segregation found mutated in cancer cells

are discussed. Curr Opin Oncol 2000, 12:82-88 © 2000 Lippincott Williams
& Wilkins, Inc.
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Cancer research over the past decade has firmly estab- 
lished that malignant cells accumulate a large number of
genetic mutations that affect differentiation, prolifera-
tion, and cell death processes. In addition, it is also
recognized that most cancers are clonal, although they
display extensive heterogeneity with respect to kary-
otypes and phenotypes of individual clonal populations.
It is estimated that numeric chromosomal imbalance,
referred to as aneuploidy, is the most prevalent genetic
change recorded among over 20,000 solid tumors
analyzed thus far [1]. Phenotypic diversity of the clonal
populations in individual tumors involve differences in
morphology, proliferative properties, antigen expression,
drug sensitivity, and metastatic potentials. It has been
proposed that an underlying acquired genetic instability
is responsible for the multiple mutations detected in
cancer cells that lead to tumor heterogeneity and
progression [2]. In a somewhat contradictory argument,
it has also been suggested that clonal expansion due to
selection of cells undergoing normal rates of mutation
can explain malignant transformation and progression
process in humans [3]. Acquired genetic instability,
nonetheless, is considered important for more rapid
progression of the disease. Although the original

hypothesis on genetic instability in cancer primarily 
focused on chromosome imbalances in the form of aneu- 
ploidy in tumor cells, the actual relevance of such muta- 
tions in cancer remains a controversial issue. 

Whether or not aneuploidy contributes to the malignant 
transformation and progression process has long been 
debated. A prevalent idea on genetics of cancer referred 
to as "somatic gene mutation hypothesis" contends that 
gene mutations at the nucleotide level alone can cause 
cancer by either activating cellular proto-oncogenes to
dominant cancer causing oncogenes and/or by inactivat- 
ing growth inhibitory tumor suppressor genes. In this 
scheme of things chromosomal instability in the form of 
aneuploidy is a mere consequence rather than a cause of 
malignant transformation and progression process. 

In this review, some of the recent observations on the 
subject are discussed and compelling evidence is
provided to suggest that aneuploidy is a distinct form of 
genetic instability in cancer that frequently correlates 
with specific phenotypes and stages of the disease. 
Furthermore, discrete genetic targets affecting chromo- 
somal stability in cancer cells, recently identified, are
also discussed. These data provide a new direction
toward elucidating the molecular mechanisms responsi-
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ble for induction of aneuploidy in cancer and may even-
tually be exploited as novel therapeutic targets in the
future.

Genetic alterations in cancer 

Alterations in many genetic loci regulating growth,
senescence, and apoptosis, identified in tumor cells,
have led to the current understanding of cancer as a 
genetic disease. The genetic changes identified in 
tumors include: subtle mutations in genes at the 
nucleotide level; chromosomal translocations leading to 
structural rearrangements in genes; and numeric 
changes in either partial segments of chromosomes or 
whole chromosomes (aneuploidy) causing imbalance in
gene dosage. 

For the purpose of this review, both segmental and whole 
chromosome imbalances leading to altered DNA dosage 
in cancer cells are included as examples of aneuploidy.

Incidence of aneuploidy in cancer

Evidence of aneuploidy involving one or more chromo-
somes has been commonly reported in human tumors.
Although these observations were initially made using 
classic cytogenetic techniques late in a tumor's evolu- 
tion and were difficult to correlate with cancer progres- 
sion, more recent studies have reported association of 
specific nonrandom chromosome aneuploidy with
different biologic properties such as loss of hormone 
dependence and metastatic potential [5]. 

Classic cytogenetic studies performed on tumor cells 
had serious limitations in scope because they were 
applicable only to those cases in which mitotic chromo- 
somes could be obtained. Because of low spontaneous 
rates of cell division in primary tumors, analyses 
depended on cells either derived selectively from 
advanced metastases or those grown in vitro for variable
periods of time. In both instances, metaphases analyzed
represented only a subset of primary tumor cell popula-
tion. Two major advances in cytogenetic analytic tech-
niques, in situ hybridization (ISH) and comparative
genomic hybridization (CGH), have allowed better reso-
lution of chromosomal aberrations in freshly isolated
tumor cells [6]. ISH analyses with chromosome-specific
DNA probes, a powerful adjunct to metaphasic analysis,
allow assessment of chromosomal anomalies within
tumor cell populations in the contexts of whole nuclear
architecture and tissue organization. CGH allows
genome wide screening of chromosomal anomalies
without the use of specific probes even in the absence
of prior knowledge of chromosomes involved. Although
both techniques have certain limitations in terms of
their resolution power, they nonetheless provide a
better approximation of chromosomal changes occurring 
among tumors of various histology, grade, and stage 



compared with what was possible with the classic cyto-
genetic techniques. Genomic ploidy measurements
have also been performed at the DNA level with flow
cytometry and cytofluorometric methods. Although
these assays underestimate chromosome ploidy due to a
chromosomal gain occasionally masking a chromosomal
loss in the same cell, several studies using these
methods have supported the conclusion that DNA
aneuploidy closely associates with poor prognosis in
various cancers [7,8]. This discussion of some recent
examples published on aneuploidy in cancer includes
discussion of studies dealing with DNA ploidy measure-
ments as well. Most of these observations are correlative
without direct proof of specific involvement of genes on
the respective chromosomes. Identification of putative
oncogenes and tumor suppressor genes on gained and
lost chromosomes in aneuploid tumors, however, is
providing strong evidence that chromosomes involved in
aneuploidy play a critical role in the tumorigenic
process.

In renal tumors, either segmental or whole chromosome
aneuploidy appears to be uniquely associated with
specific histologic subtypes [9]. Tumors from patients
with hereditary papillary renal carcinomas (HPRC)
commonly show trisomy of chromosome 7, when
analyzed by CGH. Germline mutations of a putative
oncogene MET have been detected in patients with
HPRC. A recent study [10] has demonstrated that an
extra copy of chromosome 7 results in nonrandom dupli-
cation of the mutant MET allele in HPRC, thereby
implicating this trisomy in tumorigenesis. The study
suggested that mutation of MET may render the cells
more susceptible to errors in chromosome replication,
and that clonal expansion of cells harboring duplicated
chromosome 7 reflects their proliferative advantage. In
addition to chromosome 7, trisomy of chromosome 17 in
papillary tumors and also of chromosome 8 in mesoblas-
tic nephroma are commonly seen. Association of specific
chromosome imbalances with benign and malignant
forms of papillary renal tumors, therefore, not only
contributes to an understanding of tumor origins and
evolution, but also implicates aneuploidy of the respec-
tive chromosomes in the tumorigenic transformation 
process. 

In colorectal tumors, chromosome aneuploidy is a 
common occurrence. In fact, molecular allelotyping
studies have suggested that limited karyotyping data
available from these tumors actually underestimate the
true extent of these changes. Losses of heterozygosity
reflecting loss of the maternal or paternal allele in
tumors are widespread and often accompanied by a gain
of the opposite allele. Therefore, for example, a tumor 
could lose a maternal chromosome while duplicating 
the same paternal chromosome, leaving the tumor cell 
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with a normal karyotype and ploidy but an aberrant
allelotype. It has been estimated that cancer of the
colon, breast, pancreas, or prostate may lose an average
of 25% of its alleles. It is not unusual to discover that a
tumor has lost over half of its alleles [4]. In clinical
settings, DNA ploidy measurements have revealed that
DNA aneuploidy indicates high risk of developing
severe premalignant changes in patients with ulcerative
colitis, who are known to have an increased risk of
developing colorectal cancer [11]. DNA aneuploidy has
been found to be one of the useful indicators of lymph
node metastasis in patients with gastric carcinoma and
associated with poor outcome compared with diploid
cases [12,13]. CGH analyses of chromosome aneu-
ploidy, on the other hand, were reported to correlate gain
of chromosome 20q with high tumor S phase fractions
and loss of 4q with low tumor apoptotic indices [14].
Aneuploidy of chromosome 4 in metastatic colorectal
cancer has recently been confirmed in studies that used
unbiased DNA fingerprinting with arbitrarily primed
polymerase chain reactions to detect moderate gains
and losses of specific chromosomal DNA sequences
[15]. The molecular karyotype (amplotype) generated
from colorectal cancer revealed that moderate gains of
sequences from chromosomes 8 and 13 occurred in
most tumors, suggesting that overrepresentation of
these chromosomal regions is a critical step for metasta-
tic colorectal cancer.

In addition to being implicated in tumorigenesis and
correlated with distinct tumor phenotypes, chromosome
aneuploidy has been used as a marker of risk assessment
and prognosis in several other cancers. The potential
value of aneuploidy as a noninvasive tool to identify
individuals at high risk of developing head and neck
cancer appears especially promising. Interphase fluores-
cence in situ hybridization (FISH) revealed extensive
aneuploidy in tumors from patients with head and neck
squamous cell carcinomas (HNSCC) and also in clini-
cally normal distant oral regions from the same individu-
als [16,17]. It has been proposed that a panel of chromo-
some probes in FISH analyses may serve as an
important tool to detect subclinical tumorigenesis and
for diagnosis of residual disease. The presence of aneu-
ploid or tetraploid populations is seen in 90% to 95% of
esophageal adenocarcinomas, and when seen in
conjunction with Barrett's esophagus, a premalignant
condition, predicts progression of disease [18,19].
Chromosome ploidy analyses in conjunction with loss of
heterozygosity and gene mutation studies in Barrett's
esophagus reflect evolution of neoplastic cell lineages in
vivo [20]. Evolution of neoplastic progeny from Barrett's
esophagus following somatic genetic mutations
frequently involves bifurcations and loss of heterozygos-
ity at several chromosomal loci leading to aneuploidy
and cancer. Accordingly, it is hypothesized that during



tumor cell evolution diploid cell progenitors with 
somatic genetic abnormalities undergo expansion with 
acquired genetic instability. Such instability, often 
manifested in the form of increased incidence of aneu-
ploidy, enters a phase of clonal evolution beginning in
premalignant cells that proceeds over a period of time
and occasionally leads to malignant transformation. The 
clonal evolution continues even after the emergence of 
cancer. 

The significance of DNA and chromosome aneu-
ploidy in other human cancers continues to be evalu-
ated. Among papillary thyroid carcinomas, aneuploid
DNA content in tumor cells was reported to correlate
with distant metastases, reflecting worsened progno-
sis [21]. Genome wide screening of follicular thyroid
tumors by CGH, on the other hand, revealed frequent
loss of chromosome 22 in widely invasive follicular
carcinomas [22]. Chromosome copy number gains in
invasive neoplasm compared with foci of ductal carci-
noma in situ (DCIS) with similar histology have been
proposed to indicate involvement of aneuploidy in
progression of human breast cancer [23]. ISH analyses
of cervical intraepithelial neoplasia have provided
suggestive evidence that chromosomes 1, 7 and X
aneusomy is associated with progression toward cervi-
cal carcinoma [24].

Although the prognostic value of numeric aberrations 
remains a matter of debate in human hematopoietic 
neoplasia, there have been recent studies to suggest that 
the presence of monosomy 7 defines a distinct subgroup 
of acute myeloid leukemia patients [25]. It is interesting
in this context that therapy-related myelodysplastic
syndromes have been reported to display monosomy 5
and 7 karyotypes, reflecting poor prognosis [26].

The clinical observations, mentioned previously, are
supported by in vitro studies in human and rodent cells in
which aneuploidy is induced at early stages of transforma-
tion [27,28]. It is even suggested that aneuploidy may
cause cell immortalization, in some instances, that is a
critical step preceding transformation.

Finally, in an interesting study to develop transgenic 
mouse models of human chromosomal diseases, chromo- 
some segment specific duplication and deletions of the
genome were reported to be constructed in mouse
embryonic stem cells [29]. Three duplications for a
portion of mouse chromosome 11 syntenic with human
chromosome 17 were established in the mouse
germline. Mice with 1Mb duplication developed corneal
hyperplasia and thymic tumors. The findings represent
the first transgenic mouse model of aneuploidy of a
defined chromosome segment that documents the direct
role of chromosome aneusomy in tumorigenesis.
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Aneuploidy as "dynamic cancer-causing
mutation" instead of a "consequential state"
in cancer

According to the hypothesis previously discussed, aneu-
ploidy represents either a "gain of function" or "loss of
function" mutation at the chromosome level with a
causative influence on the tumorigenesis process. The 
hypothesis, however, is based only on circumstantial 
evidence even though existence of aneuploidy is corre- 
lated with different tumor phenotypes. The existence of 
numeric chromosomal alterations in a tumor does not 
mean that the change arose as a dynamic mutation due 
to genomic instability, because several factors could lead
to consequential aneuploidy in tumors, also. Although 
aneuploidy as a dynamic mutation due to genomic insta- 
bility in tumor cells would occur at a certain measurable 
rate per cell generation, a consequential state of aneu- 
ploidy in tumors may not occur at a predictable rate 
under similar conditions or in tumors with similar 
phenotypes. In addition to genomic instability, differ- 
ences in environmental factors with selective pressure, 
could explain high incidence of aneuploidy and other 
somatic mutations in tumors compared with normal cells 
[4]. These include humoral, cell substratum, and cell-
cell interaction differences between tumor and normal 
cell environments. It could be argued that despite 
similar rates of spontaneous aneuploidy induction in 
normal and tumor cells, the latter are selected to prolif- 
erate due to altered selective pressure in the tumor cell 
environment, whereas the normal cells are eliminated
through activation of apoptosis. Alternatively, of course,
one could postulate that selective expression or overex-
pression of anti-apoptotic proteins or inactivation of
proapoptotic proteins in tumor cells may counteract
default induction of apoptosis in G2/M phase cells
undergoing missegregation of chromosomes. Recent
demonstration of overexpression of a G2/M phase anti-
apoptotic protein survivin in cancer cells [30] suggests
that this protein may favor aberrant progression of aneu-
ploid transformed cells through mitosis. This would
then lead to proliferation of aneuploid cell lineages,
which may undergo clonal evolution.

To ascertain that aneuploidy is a dynamic mutational 
event, various human tumor cell lines and transformed 
rodent cell lines have been analyzed for the rate of 
aneuploidy induction. Growth under controlled in
vitro conditions ensures that environ-
mental factors do not influence selective proliferation of
cells with chromosome instability. In one study,
Lengauer et al. [31] provided unequivocal evidence by
FISH analyses that losses or gains of multiple chromo-
somes occurred in excess of 10^-2 per chromosome per
generation in aneuploid colorectal cancer cell lines. The 
study further concluded that such chromosomal instabil- 
ity appeared to be a dominant trait. Using another in 
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vitro model system of Chinese hamster embryo (CHE) 
cells, Duesberg et al. [32] have also obtained similar
results. With clonal cultures of CHE cells, transformed
with nongenotoxic chemicals and a mitotic inhibitor,
these authors demonstrated that the overwhelming
majority of the transformed colonies contained more
than 50% aneuploid cells, indicating that aneuploidy
would have originated from the same cells that under-
went transformation. All the transformed colonies tested
were tumorigenic. It was further documented that the
ploidy factor representing the quotient of the modal
chromosome number divided by the normal diploid
number, in each clone, correlated directly with the
degree of chromosomal instability. Therefore, chromo-
somal instability was found proportional to the degree of
aneuploidy in the transformed cells and the authors
hypothesized that aneuploidy is a unique mechanism of
simultaneously altering and destabilizing, in a massive
manner, the normal cellular phenotypes. In the absence
of any evidence that the transforming chemicals used in
the study did not induce other somatic mutations, it is
difficult to rule out the contribution of such mutations
in the transformation process. These results nonetheless
make a strong case for aneuploidy being a dynamic chro-
mosome mutation event intimately associated with
cancer.
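
The ploidy factor defined above (modal chromosome number divided by the normal diploid number) can be illustrated with a minimal sketch. The function names and the metaphase counts below are hypothetical and are not taken from the cited study; the diploid number of 22 is the standard value for the Chinese hamster.

```python
from collections import Counter


def ploidy_factor(chromosome_counts, diploid_number):
    """Modal chromosome number of a clone divided by the normal diploid number."""
    if not chromosome_counts:
        raise ValueError("need at least one metaphase count")
    # most_common(1) returns the most frequent count, i.e. the modal number
    modal_number, _ = Counter(chromosome_counts).most_common(1)[0]
    return modal_number / diploid_number


def aneuploid_fraction(chromosome_counts, diploid_number):
    """Fraction of counted cells whose chromosome number differs from diploid."""
    aneuploid = sum(1 for c in chromosome_counts if c != diploid_number)
    return aneuploid / len(chromosome_counts)


# Hypothetical metaphase counts for one transformed clone (illustrative only).
counts = [22, 40, 41, 41, 41, 42, 44]
print(ploidy_factor(counts, diploid_number=22))
print(aneuploid_fraction(counts, diploid_number=22))
```

In this made-up clone the modal number is 41, so the ploidy factor is 41/22, and six of the seven counted cells are aneuploid.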

Aneuploidy versus somatic gene mutation in
cancer

The idea that numeric chromosome imbalance or aneu- 
ploidy is a direct cause of cancer was proposed at the
turn of the century by Theodor Boveri [33]. However,
the hypothesis was largely ignored over the last several 
decades in favor of the somatic gene mutation hypothe- 
sis, mentioned earlier. Evidence accumulating in the 
literature lately on specific chromosome aneusomies
recognized in primary tumors, incidence of aneuploidy 
in cells undergoing transformation, and aneuploid tumor 
cells showing a high rate of chromosome instability have 
led to the rejuvenation of Boveri's hypothesis. The 
concept has recently been discussed as a "vintage wine 
in a new bottle" [34]. The author points out that
except for rare cancers caused by dominant retroviral 
oncogenes, diploidy does not seem to occur in solid 
tumors, whereas aneuploidy is a rule rather than excep- 
tion in cancer. 

Aneuploidy as an effective mutagenic mechanism 
driving tumor progression, on the other hand, is being
recognized as a viable solution to the paradox that with
known mutation rate in non-germline cells (~10^-7 per
gene per cell generation) tumor cell lineages cannot 
accumulate enough mutant genes during a human life- 
time [35]. The concept is gaining significant credibility 
since genes that potentially affect chromosome segrega- 
tion were found mutated in human cancer. Some of 
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these genes have also been shown to have transforming 
capability in in vitro assays. Selected recent publications 
describing the findings are being discussed below in 
reference to the mitotic targets potentially involved in
inducing chromosome segregation anomalies in cells. 

Potential mitotic targets and molecular 
mechanisms of aneuploidy 

Because aneuploidy represents numeric imbalance in 
chromosomes, it is reasonable to expect that aneuploidy 
arises due to missegregation of chromosomes during cell
division. There are many potential mitotic targets,
which could cause unequal segregation of chromosomes
(Fig. 1). Recent investigations have identified several
genes involved in regulating these mitotic targets and
mitotic checkpoint functions, which can be implicated
in induction of aneuploidy in tumor cells. This discus-
sion is restricted to those mitotic targets and checkpoint
genes whose abnormal functioning has been observed in 
cancer or has been shown to cause tumorigenic transfor- 
mation of cells, in recent years. The role of telomeres is 
discussed elsewhere in this issue. For a more detailed 
description of the components of mitotic machinery and 
their possible involvement in causing chromosome 
segregation abnormalities in tumor cells, readers may 
refer to a recently published review [36].

Among the mitotic targets implicated in cancer, centro-
some defects have been observed in a wide variety of
malignant human tumors. Centrosomes play a central role
in organizing the microtubule network in interphase cells
and mitotic spindle during cell division. Multipolar
mitotic spindles have been observed in human cancers in 
situ and abnormalities in the form of supernumerary 



Figure 1. Potential mitotic targets causing aneuploidy in
oncogenesis

Diagram illustrates that defects in several processes involving chromosomal,
spindle microtubule, and centrosomal targets, in addition to abnormal cytokine-
sis, may cause unequal partitioning of chromosomes during mitosis, leading to
aneuploidy. Recently obtained evidence in favor of some of these possibilities is
discussed in the text.



centrosomes, centrosomes of aberrant size and shape as
well as aberrant phosphorylation of centrosome proteins
have been reported in prostate, colon, brain, and breast
tumors [37,38]. In view of the findings that abnormal
centrosomes retain the ability to nucleate microtubules in
vitro, it is conceivable that cells with abnormal centro-
somes may missegregate chromosomes producing aneu-
ploid cells. The molecular and genetic bases of abnormal
centrosome generation and the precise pathway through
which they regulate the chromosome segregation process
remain to be elucidated. Recent discovery of a centro-
some-associated kinase STK15/BTAK/aurora2, naturally
amplified and overexpressed in human cancers, has raised
the interesting possibility that aberrant expression of this
kinase is critically involved in abnormal centrosome func-
tion and unequal chromosome segregation in tumor cells
[39,40]. Exogenous expression of the kinase in rodent and
human cells was found to correlate with an abnormal
number of centrosomes, unequal partitioning of chromo-
somes during division, and tumorigenic transformation of
cells. It is relevant in this context to mention that the
Xenopus homologue of human STK15/BTAK/aurora2
kinase has recently been shown to phosphorylate a micro-
tubule motor protein XlEg5, the human orthologue of
which is known to participate in the centrosome separa-
tion during mitosis [41]. Findings on STK15/aurora2
kinase, thus, provide an interesting lead to a possible
molecular mechanism of centrosome's role in oncogene-
sis. Centrosomes have, of late, been implicated in onco-
genesis from studies revealing supernumerary centro-
somes in p53-deficient fibroblasts and overexpression of
another centrosome kinase PLK1 being detected in
human non-small cell lung cancer [42].

One of the critical events that ensures equal partition- 
ing of the chromosomes during mitosis is the proper 
and timely separation of sister chromatids that are
attached to each other and to the mitotic spindle.
Untimely separation of sister chromatids has been
suspected as a cause of aneuploidy in human tumors.
Cohesion between sister chromatids is established
during replication of chromosomes and is retained until
the next metaphase/anaphase transition. It has been
shown that during metaphase-anaphase transition, the
anaphase promoting complex/cyclosome triggers the
degradation of a group of proteins called securins that
inhibit sister chromatid separation. A vertebrate securin
(v-securin) has recently been identified that inhibits
sister chromatid separation and is involved in transfor-
mation and tumorigenesis. Subsequent analysis
revealed that the human securin is identical to the
product of the gene called pituitary tumor transforming
gene, which is overexpressed in some tumors and
exhibits transforming activity in NIH3T3 cells. It is
proposed that elevated expression of the v-securin may
contribute to generation of malignant tumors due to
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chromosome gain or loss produced by errors in chro-
matid separation [43].

Normal progression through mitosis during prophase to 
anaphase transition is monitored at least at two check-
points: one checkpoint operates during early prophase
at G2 to metaphase progression while the second
ensures proper segregation of chromosomes during
metaphase to anaphase transition. Several mitotic
checkpoint genes responding to mitotic spindle defects
have been identified in yeast. The metaphase-anaphase
transition is delayed following activation of this check-
point during which kinetochores remain unattached to
the spindle. The signal is transmitted through a kineto-
chore protein complex consisting of Mps1p and several
Mad and Bub proteins [44]. It is expected that, for
unequal chromosome segregation to be perpetuated
through cell proliferation cycles giving rise to aneu-
ploidy, checkpoint controls have to be abrogated.



Following this logic, Vogelstein et al. [45] hypothesized
that aneuploid tumors would reveal mutation in mitotic
spindle checkpoint genes. Subsequent studies by these
investigators have proven the validity of this hypothesis
and a small fraction of human colorectal cancers have
revealed the presence of mutations in either hBub1 or
hBubR1 checkpoint genes. It was further revealed that
mutant BUB1 could function in a dominant negative
manner conferring an abnormal spindle checkpoint
when expressed exogenously. Inactivation of spindle
checkpoint function in virally induced leukemia has also
recently been documented following the finding that
hMAD1 checkpoint protein is targeted by the Tax
protein of the human T-cell leukemia virus type 1.
Abrogation of hMAD1 function leads to multinucleation
and aneuploidy [46].

In addition to mitotic spindle checkpoint defects, failed
DNA damage checkpoint function in yeast is frequently
associated with aberrant chromosome segregation as
well. It, therefore, appears intriguing yet relevant that
the human BRCA1 gene, proposed to be involved in
DNA damage checkpoint function, when mutated by a
targeted deletion of exon 11 led to defective G2/M cell
cycle checkpoint function and genetic instability in
mouse embryonic fibroblasts [47]. The cells revealed
multiple functional centrosomes and unequal chromo-
some segregation and aneuploidy. Although the molecu-
lar basis for these abnormalities is not known at this
time, it raises the interesting possibility that such an
aneuploidy-driven mechanism may be involved in
tumorigenesis in individuals carrying germline muta-
tions of BRCA1 gene.



Conclusion 

Growing evidence from human tumor cytogenetic inves- 
tigations strongly suggests that aneuploidy is associated
with the development of tumor phenotypes. Clinical
findings of correlation between aneuploidy and tumori-
genesis are supported by studies with in vitro grown
transformed cell lines. Molecular genetic analyses of 
tumor cells provide credible evidence that mutations in 
genes controlling chromosome segregation during 
mitosis play a critical role in causing chromosome insta- 
bility leading to aneuploidy in cancer. Further elucida- 
tion of molecular and physiologic bases of chromosome 
instability and aneuploidy induction could lead to the 
development of new therapeutic approaches for 
common forms of cancer. 
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Analysis of Genomic and Proteomic Data Using Advanced Literature 
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High-throughput technologies, such as proteomic screening and DNA micro-arrays, produce vast
amounts of data requiring comprehensive analytical methods to decipher the [illegible]

[illegible] an automated [illegible] tool, termed MedGene, which

comprehensively summarizes and estimates the relative strengths of all [illegible]
relationships in Medline. Using MedGene, we analyzed a [illegible]
comparing breast cancer and normal breast tissue in the [illegible]
correlation between the strength of the literature [illegible]
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Introduction 

At its current pace, the accumulation of biomedical literature
outpaces the ability of most researchers and clinicians to stay
abreast of their own immediate fields, let alone cover a broader
range of topics. For example, to follow a single disease, e.g.,
breast cancer, a researcher would have had to scan 130 different
journals and read 27 papers per day in 1999.' This problem is
accentuated with high-throughput technologies such as DNA
micro-arrays and proteomics, which require the analysis of
large datasets involving thousands of genes, many of which are
unfamiliar to a particular researcher. In any microarray experi-
ment, thousands of genes may demonstrate statistically sig-
nificant expression changes, but only a fraction of these may
be relevant to the study. The ability to interpret these datasets
would be enhanced if they could be compared to a compre-
hensive summary of what is known about all genes. Thus, there
is a need to summarize existing knowledge in a format that
allows for the rapid analysis of associations between genes and
diseases or other specific biological concepts.

One solution to this problem is to compile structured digital
resources, such as the Breast Cancer Gene Database* and the
Tumor Gene Database. 2 However, as these resources are hand-
curated, the labor-intensive review process becomes a rate-
limiting step in the growth of the database. As a result, these
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databases have a limited scale and the genes are not selected
in a systematic fashion.

An alternative approach is automated text mining, a method
which involves automated information extraction by searching
documents for text strings and analyzing their frequency and
context. This approach has been used successfully in several
instances for biological applications. In most cases, it has been
applied to extract information about the relationships or
interactions that proteins or genes have with one another in
the literature or by functional annotation.'-' Thus far, few
publications have applied text-mining to examine the global
relationships between genes and diseases. Perez-Iratxeta et al.
automatically examined the GO (Gene Ontology) annotation
of genes and their predicted chromosomal locations in order
to identify genes linked to inherited disorders.*

To obtain a more global understanding of disease development,
it would be valuable to incorporate information regarding
[illegible] pharmacological [illegible] as well as
genetic [illegible]. This information would enable comprehensive
comparisons between large experimental datasets and the existing
literature. This would accomplish two things.
First, it would serve to validate experiments by demonstrating
[illegible] highlight which genes are corroborated in the literature
and which genes are novel in a given context. We have utilized
a computational approach to literature mining to produce a
Journal of Proteome Research 2003, 2, 405
Analysis of Data Using Advanced Literature Mining
Table 1. Systematic Sources of False Positives and False Negatives in Unfiltered Data*

source of error               error type      example                                  filter solution
gene symbol/name              false positive  MAG: myelin associated glycoprotein;     eliminate this term
is not unique                                 malignancy-associated protein
gene symbol is                false positive  PA: pallid homologue (mouse), pallidin   eliminate this term
unrelated abbreviation                        (also for Pennsylvania)
gene symbol/name              false positive  WAS: Wiskott-Aldrich Syndrome            case-sensitive string search
has language meaning                          (also the word "was")
nonstandard syntax            false negative  BAG-1 instead of BAG1                    add dash term
unofficial gene name/symbol   false negative  P53 instead of TP53                      add all gene nicknames
nonspecified gene name        false negative  estrogen receptor instead of             add family stem term
                                              estrogen receptor 1

* In preliminary studies, Medline was searched for co-[illegible] were evaluated
to identify error sources that were amenable to global filters. [illegible]
false negatives are real relationships that [illegible] filter itself introduces
error. In general, [illegible] maximized sensitivity, even at the [illegible].



added for multiple occurrences of a term or the co-occurrence 
of multiple synonyms for the same gene key. 

Medline records were searched with all qualified gene
identifiers, such as the official/preferred gene symbol, the
official/preferred gene name, all gene nicknames and all syntax
variants. In situations where there are several members of a
gene family or splice variants, some authors prefer to use a
shortened gene family name, e.g., estrogen receptor instead of
estrogen receptor 1 (ESR1), creating a source of false negatives.
For this reason, gene family stem terms were created for all
genes that have an alpha or numerical suffix (e.g., IL2RA, TGFB1,
ESR1, etc.) and then used to search the literature. The family
stem terms were handled separately from the specific gene
names so that it would be clear when linkages were made to
the gene family versus a specific member in that family.
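
The creation of family stem terms described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' JAVA implementation: the names family_stem and build_stem_index and the two suffix patterns are hypothetical, chosen to cover gene symbols shaped like ESR1 (numerical suffix) and IL2RA (alpha suffix).

```python
import re

# Assumed pattern 1: a symbol ending in a letter followed by a numeric suffix (ESR1).
_NUMERIC_SUFFIX = re.compile(r"([A-Za-z][A-Za-z0-9]*?[A-Za-z])\d+")
# Assumed pattern 2: letters, digits, letters, then one trailing capital letter (IL2RA).
_ALPHA_SUFFIX = re.compile(r"([A-Za-z]+\d+[A-Za-z]+)[A-Z]")


def family_stem(symbol):
    """Return a hypothetical family stem for a gene symbol, or None if no suffix is found."""
    for pattern in (_NUMERIC_SUFFIX, _ALPHA_SUFFIX):
        match = pattern.fullmatch(symbol)
        if match:
            return match.group(1)
    return None


def build_stem_index(symbols):
    """Group symbols under their stem, kept apart from the specific gene names."""
    index = {}
    for symbol in symbols:
        stem = family_stem(symbol)
        if stem is not None:
            index.setdefault(stem, []).append(symbol)
    return index
```

Keeping the stem index in its own dictionary mirrors the separation described in the text: a hit on a stem is recorded against the family, not against any one member.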

To improve performance and accuracy, some pre-selection
was applied to the records that were scanned. First, review
articles were eliminated to avoid redundant treatment of
citations. Second, non-English journals were removed because
the natural language filters were only relevant to English
publications. Finally, journals unlikely to contain primary data
about gene-disease relationships were also removed (e.g., Int
J Health Educ, Bedside Nurse, and J Health Econ). Together
these filters reduced the 12 198 221 Medline publications (July
2002) by 37%. 7
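
The three pre-selection steps can be illustrated with a short sketch. The record layout (the keys is_review, language, and journal) and the function name preselect are assumptions made for the example; the source does not describe how the records were represented.

```python
def preselect(records, excluded_journals):
    """Keep records that pass the three filters described in the text.

    Each record is assumed to be a dict with hypothetical keys
    'is_review' (bool), 'language' (str) and 'journal' (str).
    """
    kept = []
    for record in records:
        if record["is_review"]:
            continue  # review articles are dropped
        if record["language"] != "eng":
            continue  # non-English journals are dropped
        if record["journal"] in excluded_journals:
            continue  # journals unlikely to hold primary gene-disease data
        kept.append(record)
    return kept
```

For scale, a 37% reduction of 12 198 221 records leaves roughly 7.7 million records to scan.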

Ranking the Relative Strengths of Gene-Disease Associations.
In total, there were 618 708 gene-disease co-citations
in which 16% (8297) of all studied genes had been associated
to a disease and 96% (3875) of all diseases had been associated
to at least one gene. To rank the relative strengths of gene-
disease relationships, we tested several different statistical
methods and examined the results. With the exception of the
relative risk estimates, the methods provided similar results
with respect to the rank order of the gene-disease association
strengths. However, after comparing the results to other
databases and after consulting disease experts, the log of the
product of frequency (LPF) was selected for further analysis
because it gave the best results overall.

Validation of MedGene. In developing this tool, it was
important to minimize the number of missed genes (false
negatives) and miscalled genes (false positives). However, in
situations when these goals were in conflict, inclusiveness was
prioritized. To determine the false negative rate in MedGene,
breast cancer was used as a test case because it was associated
with more genes than any other human disease and because
Figure 1. Evaluation of the false negative rate by comparison
[illegible]. [illegible] identified by MedGene were compared with
those listed in [illegible] databases, including the Tumor Gene
Database, [illegible] (GC) and Swiss-Prot. Genes were considered
false negatives if they were represented in at least one of these
other databases and not in MedGene and their link to breast cancer
was supported by at least one literature reference. All literature
references [illegible] manual review to confirm [illegible]. The
number of genes in each database or shared by more than one
database is indicated. The false negative rate was calculated by
[illegible] genes in other databases (285).

there were several public databases that link genes to breast
cancer. We compared the list of breast cancer-related genes
from MedGene to these databases, as illustrated in Figure 1.
Among the 285 distinct breast cancer-related genes that were
supported by at least one literature citation in these hand-
curated databases, 26 were absent from MedGene, suggesting
a false negative rate of approximately 9%. To determine why
these were missed, all literature references for these genes (80
comprehensive set of gene-disease relationships. In addition,
we have developed a novel approach to assess the strength of
each association based on the frequency of citation and co-
citation. We applied this tool to help interpret the data from a
large micro-array gene expression experiment comparing
normal and cancerous breast tissue.

Methods 

MedGene Database. MedGene is a relational database, storing
disease and gene information from NCBI, text mining results,
statistical scores, and hyperlinks to the primary literature.
MedGene has a web-based user interface for users to
query the database (http://hipseq.med.harvard.edu/MedGene/).

Text Mining Algorithms. MeSH files were downloaded from
the MeSH web site at NLM (National Library of Medicine) (http://
www.nlm.nih.gov/mesh/meshhome.html) and human disease
categories were selected. LocusLink files were downloaded from
the LocusLink web site at NCBI (http://www.ncbi.nih.gov/
LocusLink/). Official/preferred gene symbol, official/preferred
gene name, gene alternative symbols and names, and all
relevant annotations and URLs for each LocusLink record were
collected. Gene search terms were used for literature searching
and included all qualified gene names, gene symbols, and gene
family terms. Primary gene keys, predominantly qualified gene
family terms and gene official/preferred symbols, were used
to index Medline records. If the official/preferred gene symbols
did not meet the standards to be an index, then qualified gene
official/preferred names were used. A local copy of Medline
records (up to July 2002) was pre-selected.

A JAVA module examined the MeSH terms and then indexed
each Medline record with the appropriate disease terms. A
separate JAVA module was used to examine the titles and
abstracts for gene search terms and then to index the gene-
related Medline records with the relevant primary gene key(s).
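
The gene indexing step can be sketched in a few lines. This is an assumption-laden illustration, not the JAVA module itself: the function name find_gene_keys, the whole-token matching rule, and the idea of passing a set of case-sensitive terms are all hypothetical choices made to show how a term such as WAS can be kept apart from the English word "was".

```python
import re


def find_gene_keys(text, synonyms_by_key, case_sensitive_terms=frozenset()):
    """Return the primary gene keys whose synonyms occur in a title or abstract.

    synonyms_by_key maps a primary gene key to its search terms. Terms listed
    in case_sensitive_terms are matched exactly; all others ignore case.
    """
    hits = set()
    for key, synonyms in synonyms_by_key.items():
        for term in synonyms:
            flags = 0 if term in case_sensitive_terms else re.IGNORECASE
            # Assumption: a term must not be embedded in a longer alphanumeric token.
            pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
            if re.search(pattern, text, flags):
                hits.add(key)
                break  # one synonym is enough to index the record with this key
    return hits
```

A record is indexed once per key however many synonyms match, which is consistent with the statement that no additional weight was added for multiple occurrences.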

Statistical Methods. For every gene and disease pair, we
counted records that were indexed for both gene and disease
(double positive hits), for disease only (disease single hits), for
gene only (gene single hits), and for neither gene nor disease
(double negative hits) to generate a 2 x 2 contingency table.
On the basis of the contingency table framework, we applied
different statistical methods to estimate the strength of gene-
disease relationships and evaluated the results. These methods
included chi-square analysis, Fisher's exact probabilities, relative
risk of gene, and relative risk of disease (http://
hipseq.med.harvard.edu/MedGene/). In addition, we computed
the product of frequency, which is the product of the
proportion of disease/gene double hits to disease single hits
and the proportion of disease/gene double hits to gene single
hits. To obtain a normal distribution, we transformed all the
statistical scores using the natural logarithm. We selected the
log of the product of frequency (LPF) to validate MedGene and
to use for the analysis with the micro-array data. Spearman
rank-correlation coefficients were used to assess the linear
relationship between LPF and micro-array fold change in
expression level.
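
The contingency counts and the LPF score can be written out as a small sketch. It follows the wording of the paragraph literally (double hits divided by disease single hits, times double hits divided by gene single hits); whether the authors' denominators also included the double hits is not stated, so that reading is an assumption. The function names and the record layout (a pair of sets per record) are hypothetical.

```python
import math


def contingency_counts(records, gene, disease):
    """Count the four cells of the 2 x 2 table for one gene-disease pair.

    records is assumed to be a list of (gene_keys, disease_terms) pairs of sets.
    """
    both = disease_only = gene_only = neither = 0
    for gene_keys, disease_terms in records:
        has_gene = gene in gene_keys
        has_disease = disease in disease_terms
        if has_gene and has_disease:
            both += 1
        elif has_disease:
            disease_only += 1
        elif has_gene:
            gene_only += 1
        else:
            neither += 1
    return both, disease_only, gene_only, neither


def log_product_of_frequency(both, disease_only, gene_only):
    """Natural log of (both / disease_only) * (both / gene_only); None if undefined."""
    if both == 0 or disease_only == 0 or gene_only == 0:
        return None
    return math.log((both / disease_only) * (both / gene_only))
```

Returning None for an empty cell avoids taking the logarithm of zero; how such pairs were handled in MedGene is not described in the text.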

Global Analysis. Diseases with at least 50 related genes were
selected for clustering analysis, and the LPF scores were
normalized with total score for each disease. Hierarchical
clustering was done with the 'Cluster' software and the
clustering result was visualized using 'TreeViewer' (http://
rana.lbl.gov/EisenSoftware.htm).
Breast Tissue Micro-Arrays. Eighty-nine breast cancer
samples (19% ER-positive) and 7 normal breast tissue samples
were selected from the Harvard Breast SPORE frozen tissue
repository and were representative of the spectrum of histological
types, grades, and hormone receptor immuno-phenotypes
of breast cancer. Biotinylated cRNA, generated from the
total RNA extracted from the bulk tumor, was hybridized to
Affymetrix U95A oligonucleotide micro-arrays. These micro-
arrays consist of 12 400 probes, which represent approximately
9000 genes. Raw expression values were obtained using GENECHIP
software from Affymetrix, and then further analyzed using
the DNA-Chip Analyzer (dChip) custom software.

Results 

Automated Indexing of Medline Records by Disease and
Gene. To study the gene-disease associations in the literature,
we first compiled complete lists for human diseases and human
genes. To index all Medline records that were relevant to
human diseases, the Medical Subject Heading (MeSH) index
of Medline records was utilized. MeSH is a controlled medical
vocabulary from the National Library of Medicine and consists
of a set of terms or subject headings that are arranged in both
an alphabetic and a hierarchical structure. Medline records
are reviewed manually and MeSH terms are added to each with
software assistance. Twenty-three human disease category
headings along with all of their child terms (see the Supporting
Information, Supplemental Table 1, or visit http://hipseq.
med.harvard.edu/MedGene/publication/s_Table 1.html) were
selected from the 2002 MeSH index, creating a list of 4033
human diseases.

No index comparable to the MeSH index exists for genes,
and thus it was necessary to apply a string search algorithm
for gene names or symbols found in Medline text. A complete
list of genes, gene names, gene symbols, and frequently used
[illegible] collected from the LocusLink database at
NCBI, which contains 53 259 independent records keyed
by an official gene symbol or name (June 19, 2002). For the
purposes of this study, no distinction was made between genes
[illegible] both, differentiating the two only by the use of italics, if at all.
[illegible] of this study, this lack of distinction is
unlikely to have a large effect and may in fact be beneficial.

Initial attempts to search the literature using these lists
[illegible] sources of false positives and false negatives
(Table 1). False positives primarily arose when the searched
term had other meanings, whereas false negatives arose from
syntax discrepancies, necessitating the development of filters
[illegible]. The [illegible] issues [illegible] readily handled
by including alternate syntax forms in the search terms. The
false positive cases, caused by duplicative and unrelated
meanings for the terms, were more difficult to manage. Where
[illegible] case-sensitive mapping reduced inappropriate [illegible]
the terms had to be eliminated entirely, thereby reducing the
false positive rate but unavoidably under-representing [illegible].
[illegible] purposes of data tracking, a primary gene key was
selected [illegible] represent all synonyms that correspond to each
gene. Medline records were indexed with a primary gene key
when any synonym for that key was found in the title or
abstract. Case-insensitive string mapping was used for all
searches except as noted above. No additional weight was
papers) were reviewed manually (see the Supporting Information,
Supplemental Table 2, or visit http://hipseq.med.
harvard.edu/MedGene/publication/s_Table 2.html). Among
these papers, most false negatives were caused by nonstandard
gene terms or gene terms eliminated by our specificity filters.
Few genes were missed because they were only mentioned in
review papers (0.4%) or they appeared only in the body of the
manuscript but not the abstract or title (1.1%). Of note,
MedGene identified approximately 2000 additional breast
cancer-related genes not listed in any other database.
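
The false negative estimate above is a simple ratio, shown here as a sketch. The function name and the use of Python sets are assumptions made for the example; only the counts 26 and 285 come from the text.

```python
def false_negative_rate(reference_genes, medgene_genes):
    """Fraction of literature-supported reference genes that MedGene did not list."""
    missed = set(reference_genes) - set(medgene_genes)
    return len(missed) / len(set(reference_genes))


# Toy stand-ins reproducing the reported counts: 285 reference genes, 26 of them missed.
reference = {f"gene{i}" for i in range(285)}
found = {f"gene{i}" for i in range(26, 285)}
print(f"{false_negative_rate(reference, found):.1%}")  # prints 9.1%
```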

To assess the false positive error rate, two complementary
approaches were used: a detailed analysis of one disease and
a global examination of 1000 diseases. The detailed approach
examined the false positive error rate and its sources, whereas
the global approach tested whether the overall results made
biomedical sense.

Using the LPF, H67 genes related to prostate cancer were
assembled in rank order. We then retrieved approximately 300
Medline records each for the highest ranked 100 and the lowest
ranked 200 genes and manually reviewed the titles and
abstracts to determine the verity of the association. Nearly 80%
of the highest ranked 100 genes fell into one of the five
categories that reflect meaningful gene-disease relationships
(see the Supporting Information, Supplemental Table 3, or visit
http://hipseq.med.harvard.edu/MedGene/publication/
s_Table 3.html). Among the lowest ranked 200 genes,
approximately 70% reflected true relationships. Of the 600 records
reviewed, there were only two in which the association between
the gene and the disease was described as negative. Both were
genes with very low scores. In both cases, the authors did not
argue the absence of any relationship, but rather that a
particular feature of the gene or protein was not shown to be
related to human prostate cancer. 11 - 1 *

The coincidence of some gene symbols with medical
abbreviations, chemical abbreviations and biological abbreviations
resulted in most of the false positives (see the Supporting
Information, Supplemental Table 4, or visit http://hipseq.
med.harvard.edu/MedGene/publication/s_Table 4.html),
emphasizing the importance of the filters that were added in the
search algorithm (Table 1). Without the filters, the false positive
rate more than doubled, and the false negative rate rose
dramatically (data not shown). For example, among the papers
about breast cancer, there were only 12 Medline records that
referred to ESR1 and 10 to ESR2, whereas almost 2000 papers
mentioned estrogen receptor without specifying ESR1 or ESR2;
this latter group was detected by the family stem term filter.

To further validate these results, a global analysis of the gene-
disease relationships described by MedGene was performed.
For this experiment, it was reasoned that the more closely
related the diseases are to one another, the more they will be
related to the same gene sets. Thus, if the relationships defined
by MedGene accurately reflected the literature, then an
unsupervised hierarchical clustering of the gene data should group
diseases in a manner consistent with common medical thinking.
Conversely, if the clustered diseases do not make sense
biologically or medically, it may reflect excessive false positives,
false negatives, or inappropriate scoring of the data.

To execute this experiment, the gene sets and the corresponding
LPF values for 1000 randomly selected diseases (each
with at least 50 gene relationships) were used as a dataset for
clustering the diseases. A review of the results showed that the
resulting disease clusters were indeed logical based upon
common medical knowledge (see the Supporting Information,
Supplemental Figure 1, or visit http://hipseq.med.harvard.edu/
MedGene/publication/s_Figure 1.html). For example, in one
such cluster shown in Figure 2, diabetes and its complications
grouped together and were also closely linked to diseases
associated with starvation states.

The number of genes associated with a given disease can
be estimated by adjusting the MedGene number up by the false
negative rate (~9%) and down by the false positive rate (~26%
on average). Using this, the average disease has 103.7 ± 45.3
(mean ± s.d.) genes associated with it, although the range is
quite broad, with 2359 genes related to breast cancer, 2122
genes related to lung cancer and no genes related to a number
of diseases.
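
One way to read the adjustment described above is sketched below. The text does not give the formula, so this is an assumption: the observed count is first reduced by the false positive rate, and the remainder is then scaled up to allow for the genes missed at the false negative rate. The function name and default rates are illustrative.

```python
def adjusted_gene_count(observed, false_negative_rate=0.09, false_positive_rate=0.26):
    """Estimate the true number of associated genes from a raw MedGene count.

    Assumed reading: true hits = observed * (1 - FP rate), and these true hits
    are (1 - FN rate) of all real associations.
    """
    true_hits = observed * (1.0 - false_positive_rate)
    return true_hits / (1.0 - false_negative_rate)
```

Under this reading, a raw count of 100 genes corresponds to roughly 81 real associations.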

Applying MedGene to the Analysis of Large Datasets. Access
to a comprehensive summary of the genes linked to human
diseases provided an opportunity to analyze data obtained from
a high-throughput experiment. We compared the MedGene
breast cancer gene list to a gene expression data set generated
from a micro-array analysis comparing breast cancer and
normal breast tissue samples. Micro-array analysis identified
2286 genes that had greater than a 3-fold difference in mean
expression level between breast cancer samples and normal
breast samples. Using MedGene, we sorted the 2286 genes into
four classes: 555 genes directly linked to breast cancer in the
literature by gene term search (first-degree association by gene
name); 32g genes directly linked by family term search (first-
degree association by family term); 1021 genes linked to breast
cancer only through other breast cancer genes (second-degree
association); and 505 genes not previously associated with
breast cancer. (See the Supporting Information, Supplemental
Figure 2, or visit http://hipseq.med.harvard.edu/MedGene/
publication/s_Figure 2.html.) Among the 505 previously
unrelated genes, 467 were either newly identified genes or genes
that had not previously been associated with any disease.
Among the remaining 38 genes, 9 had been related to other
cancers, specifically esophageal, colon, uterine, skin, and cervix.

To determine whether the genes highlighted by the micro-
array analysis were more likely to have been previously linked
to breast cancer in the literature, we created a two-dimensional
plot of the fold change of expression level between breast
cancer and normal tissue versus the literature score (LPF)
(Figure 3A). There was a broad spread of expression changes
among the genes directly linked to breast cancer, ranging from
less than 1-fold change (65%) to over 40-fold (0.3%). Notably,
the majority of genes with greater than 10-fold expression
changes were linked to breast cancer by first-degree association.

Among all 754 genes directly linked to breast cancer in the
literature, there was no correlation between LPF and micro-
array fold change (r = 0.018, P value = 0.62). However, when
we stratified the analysis based on the magnitude of the fold
change, we observed an increasing trend in correlation (Figure
3B), suggesting that genes with a more substantial change in
expression level were more likely to have a stronger association
in the literature. For genes that had 10-fold change or more in
expression level, the correlation increased to 0.41 (P value ~

When we evaluated the micro-array data separately for ER
positive and ER negative tumors, the trend in correlation
between fold change and literature score was highly dependent
on estrogen receptor status. Interestingly, there was a similar
trend in correlation for ER positive tumors, but no trend in
correlation for ER negative tumors.
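
The stratified correlation analysis can be sketched with a small pure-Python Spearman implementation. It is offered only as an illustration: the function names, the tie handling by average ranks, and the choice to stratify by "fold change at or above a threshold" are assumptions, and P values are not computed here.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation (Pearson correlation of the ranks); None if undefined."""
    if len(xs) != len(ys) or len(xs) < 2:
        return None
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def stratified_spearman(genes, thresholds):
    """genes: list of (lpf, fold_change). Correlate within genes at or above each threshold."""
    result = {}
    for threshold in thresholds:
        subset = [(lpf, fold) for lpf, fold in genes if fold >= threshold]
        result[threshold] = spearman([lpf for lpf, _ in subset], [fold for _, fold in subset])
    return result
```

Running the same function on ER positive and ER negative subsets separately would give the two curves compared in the text.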
Finally, to validate our findings, we computed similar
correlations between the breast cancer expression data and
LPF scores generated by MedGene for hypertension, a
disease unrelated to breast cancer. As expected, we did not
observe an increasing trend in correlation for hypertension.







Hu et al.

[Figure 3, panels A and B. Axis label: Microarray Fold Change (Cancer/Normal)]

Figure 3. Relationship between literature score and functional data for breast cancer.
3A. The data [illegible] expression analysis of samples for breast tumors and normal
breast tissue were analyzed to [illegible] between tumor and normal sample (cutoff =
3-fold change). [illegible] Green dots represent first-degree association by gene search,
[illegible] association and red dots represent no association. Some well-studied genes,
such as BRCA1 [illegible], [illegible] a substantial difference in expression level.
Furthermore, the majority of [illegible] literature had less than 10-fold expression
changes (shaded area). 3B. The Spearman rank correlation [illegible] (LPF) and fold
change of expression level between tumor and normal [illegible] amount of fold of
expression level (x-axis). Gene rank lists were generated for [illegible]. Correlations
were computed between the breast cancer gene LPF scores and fold change [illegible]
among estrogen receptor positive [illegible] only (blue) and estrogen receptor negative
tumors only (purple).



Table 2. Top 25 Genes Related to Selected Human Diseases*



research articles 



breast neoplasms: estrogen receptor, PGR, ERBB2, BRCA1, EGFR, CYP19,
TFF1, PSEN2, TP53, C£S3, CEACAM5, ERBB3, cyclin, COX5A, cathepsin,
ERBB4, TRAM, CCND1, EOF, MUC1, insulin-like, BCL2, mucin, FGF3

hypertension: REN, DBF, LEP, ACT, INS, kallikrein, ACE, endothelin,
5/0045, BDK, DIANPff, SARI, PBi, CD59, ALB, CYP11B2, K4AT9R,
angiotensin receptor, AGTR2, NPPA, LVM, PBH, NPY, POMC, neuropeptide

rheumatoid arthritis: TNFRSF1QA, CRP, AC, AS, ESR1, HLA-DRB1, DRl,
interleukin, TNF, R6, collagen, IL1A, ACR, TNFRSF12, IL2, CHI3L1,
interleukin 1, matrix metalloproteinase, interferon, CD$8, IL4, H17,
MMP3, SE

bipolar disorder: ERDAI, SNAP25, PFKL, PRD2, TRH, IMPA2, HTR3A, DRD3,
REM, KCNN3, DRD4, HTR2C, RELN, DBH, MAOA, com, HTR2A, SYNJ1, INPP1,
NEDD4L, FRA13C, transducer of ERBB2, BAIAP3, ATP1B3, DRD5

atherosclerosis: apolipoprotein, APOE, LDLR, ELN, ARG1, APOB, APOA1,
MSR1, LPL, PON1, plasminogen activator inhibitor, PLC, vascular cell
adhesion molecule, ATOM, VWF, INS, ARG2, ABCA1, OLR1, collagen, MCP,
lipoprotein, APOA2, intercellular adhesion molecule, RAB27A



Discussion 

The Human Genome Project heralded a new era in biological
research where the emphasis on understanding specific pathways
has expanded to global studies of genomic organization
and biological systems. High-throughput technologies can
provide novel insight into comprehensive biological function
but also introduce new challenges. The utility of these
technologies is limited by the ability to generate, analyze, and
interpret large gene lists. MedGene, a relational database
derived by mining the information in Medline, was created to
address this need. MedGene users can query for a rank-ordered
list of human gene-disease relationships (Table 2) for one or
more diseases. Each entry is hyperlinked to the original papers
supporting each association and to other relevant databases.

MedGene is an innovative extension of previous text mining
approaches. Perez-Iratxeta et al. used the GO annotation and
their chromosomal locations to predict genes that may contribute
to inherited disorders. MedGene takes a broader view
and includes all diseases and all possible gene-disease relationships.
Furthermore, MedGene utilizes co-citation to indicate a
relationship rather than GO annotation, which is limited to the
subset of genes that have GO annotation. Our approach is
complementary to that taken by Chaussabel and Sher, who
used the frequency of co-cited terms to cluster genes into a
hierarchy of gene-gene relationships. 9

A unique aspect of this tool is the ability to assess the relative
strengths of gene-disease relationships based on the frequency
of both co-citation and single citation. This presupposes that
most co-citations describe a positive association, often referred
to as publication bias, 15 and is supported by our observations
that negative associations are rare (Supplemental Table 3:
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 3.html).
Of course, relationships established by frequency
of co-citation do not necessarily represent a true biological link;
however, it is strong evidence to support a true relationship.

Another important feature of MedGene is the implementation
of software filters that substantially reduced the error rate.
We estimate that less than 10% of all associations were missed
and at least 70% of even the weakest associations were real.
For this study, all of the filters that we applied were general
ones, e.g., expanding the list of all gene names to address the
different syntax forms used by different journals, eliminating
gene names that correspond to common English words, etc.
The majority of the remaining search term ambiguities were
idiosyncratic and difficult to identify systematically without
causing a significant rise in false negatives. Alternative
approaches, such as the examination of the nearest neighbor
terms, need to be considered to further reduce the false positive
rate.

It is not uncommon to see expression changes in micro-
array experiments as small as 2-fold reported in the literature.
Even when these expression changes are statistically significant,
it is not always clear if they are biologically meaningful. When
comparing expression levels of disease to normal tissue, one
expects an enrichment of known disease-related genes to
appear in the altered expression group. MedGene provided a
unique opportunity to test this notion in the context of existing
knowledge on a novel breast cancer micro-array dataset. For
genes displaying a 5-fold change or less in tumors compared
to normal, there was no evidence of a correlation between
altered gene expression and a known role in the disease. This
Table 3. Genes with Large Expression Changes in ER- but
Not in ER+ Breast Tumors
Table 3. MedGene identified a set of relatively understudied, yet highly
expressed genes in ER negative, but not ER positive breast tumors. All of
these genes have either never been co-cited with breast cancer or have a
weak association except those marked with an *.



reflects the many genes whose role in breast cancer may not
involve large changes in expression in sporadic tumors (e.g.,
BRCA1 and BRCA2) and genes whose modest changes in
expression may be unrelated to the disease. Strikingly, among
genes with a 10-fold change or more in expression level, there
was a strong and significant correlation between expression
level and a published role in the disease, providing the first
global validation of the micro-array approach to identifying
disease-specific genes.

The results derived from MedGene have two implications.
First, a careful hunt for corroborating evidence of a role in
breast cancer should precede any further study of genes with
less than 5-fold expression level changes. Second, any genes
with 10-fold changes or more are likely to be related to breast
cancer and warrant attention. It is likely that this threshold will
change depending on the disease as well as the experiment.

Interestingly, the observed correlation was only found among
ER-positive tumors, not ER-negative. This may reflect a bias
in the literature to study the more prevalent type of tumor in
the population. Furthermore, this emphasizes that caution
must be taken when interpreting experiments that may contain
subpopulations that behave very differently. The MedGene
approach identified a set of relatively understudied, yet highly
expressed genes in ER-negative tumors that are worthy of
further examination (Table 3).
In conclusion, we have developed an automated method of
summarizing and organizing the vast biomedical literature. To
our knowledge, the resulting database is the most comprehensive
and accurate of its kind. By generating a score that reflects
the strength of the association, it provides an important tool
for the rapid and flexible analysis of large datasets from various
high-throughput screening experiments. Furthermore, it can
be used for selecting subsets of genes for functional studies,
for building disease-specific arrays, for looking at genes common
to multiple diseases and various other high-throughput
applications. In the future, it will be possible to enhance the
utility of the MedGene database by building links between
genes and other MeSH terms as well as other biological
processes and concepts, such as cell division and responses to
small molecules.
assignment of gene indexes (Supplemental Table 4); a review
of the results, showing that the resulting disease clusters were
indeed logical (Supplemental Figure 1); and a review of the
results showing that among the 505 previously unrelated genes,
467 were either newly identified genes or genes that had not
previously been associated with any disease (Supplemental
Figure 2). This material is available free of charge via the
Internet at http://pubs.acs.org and at the web sites mentioned
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SECOND DECLARATION OF PAUL POLAKIS, Ph.D. 

I, Paul Polakis, Ph.D., declare and say as follows: 

I am currently employed by Genentech, Inc. where my job title is Staff 
Scientist. 

Since joining Genentech in 1999, one of my primary responsibilities has
been leading Genentech's Tumor Antigen Project, which is a large research
project with a primary focus on identifying tumor cell markers that find use
as targets for both the diagnosis and treatment of cancer in humans.

As I stated in my previous Declaration dated May 7, 2004 (attached as
Exhibit A), my laboratory has been employing a variety of techniques,
including microarray analysis, to identify genes which are differentially 
expressed in human tumor tissue relative to normal human tissue. The 
primary purpose of this research is to identify proteins that are abundantly 
expressed on certain human tumor tissue(s) and that are either (i) not 
expressed, or (ii) expressed at detectably lower levels, on normal tissue(s). 

4. In the course of our research using microarray analysis, we have identified
approximately 200 gene transcripts that are present in human tumor tissue
at significantly higher levels than in normal human tissue. To date, we
have successfully generated antibodies that bind to 31 of the tumor antigen 
proteins expressed from these differentially expressed gene transcripts and 
have used these antibodies to quantitatively determine the level of 
production of these tumor antigen proteins in both human tumor tissue and 
normal tissue. We have then quantitatively compared the levels of mRNA 
and protein in both the tumor and normal tissues analyzed. The results of 
these analyses are attached herewith as Exhibit B. In Exhibit B, means 
that the mRNA or protein was detectably overexpressed in the tumor tissue 
relative to normal tissue and means that no detectable overexpression 
was observed in the tumor tissue relative to normal tissue. 

5. As shown in Exhibit B, of the 31 genes identified as being detectably
overexpressed in human tumor tissue as compared to normal human tissue
at the mRNA level, 28 of them (i.e., greater than 90%) are also detectably
overexpressed in human tumor tissue as compared to normal human tissue
at the protein level. As such, in the cases where we have been able to
quantitatively measure both (i) mRNA and (ii) protein levels in both (i) 
tumor tissue and (ii) normal tissue, we have observed that in the vast 
majority of cases, there is a very strong correlation between increases in 
mRNA expression and increases in the level of protein encoded by that 
mRNA. 



6. Based upon my own experience accumulated in more than 20 years of 
research, including the data discussed in paragraphs 4-5 above and my 
knowledge of the relevant scientific literature, it is my considered scientific
opinion that for human genes, an increased level of mRNA in a tumor
tissue relative to a normal tissue more often than not correlates to a similar
increase in abundance of the encoded protein in the tumor tissue relative to
the normal tissue. In fact, it remains a generally accepted working
assumption in molecular biology that increased mRNA levels are more
often than not predictive of elevated levels of the encoded protein. In fact, 
an entire industry focusing on the research and development of therapeutic 
antibodies to treat a variety of human diseases, such as cancer, operates on 
this working assumption. 

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be
true, and further that these statements were made with the knowledge that
willful false statements and the like so made are punishable by fine or
imprisonment, or both, under Section 1001 of Title 18 of the United States
Code and that such willful statements may jeopardize the validity of the
application or any patent issued thereon. 





Paul Polakis, Ph.D.
EXHIBIT A



DECLARATION OF PAUL POLAKIS, Ph.D. 
I, Paul Polakis, Ph.D., declare and say as follows: 

1. I was awarded a Ph.D. by the Department of Biochemistry of the Michigan
State University in 1984. My scientific Curriculum Vitae is attached to and forms 
part of this Declaration (Exhibit A). 

2. I am currently employed by Genentech, Inc. where my job title is Staff
Scientist. Since joining Genentech in 1999, one of my primary responsibilities has
been leading Genentech's Tumor Antigen Project, which is a large research project 
with a primary focus on identifying tumor cell markers that find use as targets for 
both the diagnosis and treatment of cancer in humans. 

3. As part of the Tumor Antigen Project, my laboratory has been analyzing
differential expression of various genes in tumor cells relative to normal cells.
The purpose of this research is to identify proteins that are abundantly expressed
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at 
lower levels, on corresponding normal cells. We call such differentially expressed 
proteins "tumor antigen proteins". When such a tumor antigen protein is 
identified, one can produce an antibody that recognizes and binds to that protein. 
Such an antibody finds use in the diagnosis of human cancer and may ultimately 
serve as an effective therapeutic in the treatment of human cancer. 

4. In the course of the research conducted by Genentech's Tumor Antigen 
Project, we have employed a variety of scientific techniques for detecting and 
studying differential gene expression in human tumor cells relative to normal cells, 
at genomic DNA, mRNA and protein levels. An important example of one such 
technique is the well known and widely used technique of microarray analysis 
which has proven to be extremely useful for the identification of mRNA molecules
that are differentially expressed in one tissue or cell type relative to another. In the
course of our research using microarray analysis, we have identified
approximately 200 gene transcripts that are present in human tumor cells at
significantly higher levels than in corresponding normal human cells. To date, we
have generated antibodies that bind to about 30 of the tumor antigen proteins
expressed from these differentially expressed gene transcripts and have used these
antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. We 
have then compared the levels of mRNA and protein in both the tumor and normal 
cells analyzed. 

5. From the mRNA and protein expression analyses described in paragraph 4
above, we have observed that there is a strong correlation between changes in the
level of mRNA present in any particular cell type and the level of protein
expressed from that mRNA in that cell type. In approximately 80% of our
observations we have found that increases in the level of a particular mRNA
correlates with changes in the level of protein expressed from that mRNA when
human tumor cells are compared with their corresponding normal cells.

6. Based upon my own experience accumulated in more than 20 years of 
research, including the data discussed in paragraphs 4 and 5 above and my 
knowledge of the relevant scientific literature, it is my considered scientific 
opinion that for human genes, an increased level of mRNA in a tumor cell relative 
to a normal cell typically correlates to a similar increase in abundance of the 
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a
central dogma in molecular biology that increased mRNA levels are predictive of
corresponding increased levels of the encoded protein. While there have been
published reports of genes for which such a correlation does not exist, it is my
opinion that such reports are exceptions to the commonly understood general rule
that increased mRNA levels are predictive of corresponding increased levels of the
encoded protein.

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be true, 
and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States Code and that such willful 
statements may jeopardize the validity of the application or any patent issued 
thereon.

The DEAD box gene, DDX1, is a putative RNA helicase
that is co-amplified with MYCN in a subset of retinoblas-
toma (RB) and neuroblastoma (NB) tumors and cell
lines. Although gene amplification usually involves hun-
dreds to thousands of kilobase pairs of DNA, a number of
studies suggest that co-amplified genes are only overex-
pressed if they provide a selective advantage to the cells
in which they are amplified. Here, we further character-
ize DDX1 by identifying its putative transcription and
translation initiation sites. We analyze DDX1 protein
levels in MYCN/DDX1-amplified NB and RB cell lines
using polyclonal antibodies specific to DDX1 and show
that there is a good correlation with DDX1 gene copy
number, DDX1 transcript levels, and DDX1 protein lev-
els in all cell lines studied. DDX1 protein is found in both
the nucleus and cytoplasm of DDX1-amplified lines but
is localized primarily to the nucleus of nonamplified
cells. Our results indicate that DDX1 may be involved in
either the formation or progression of a subset of NB
and RB tumors and suggest that DDX1 normally plays a
role in the metabolism of RNAs located in the nucleus of
the cell.



DEAD box proteins are a family of putative RNA helicases 
that are characterized by eight conserved amino acid motifs, 
one of which is the ATP hydrolysis motif containing the core 
amino acid sequence DEAD (Asp-Glu-Ala-Asp) (1-3). Over 40 
members of the DEAD box family have been isolated from a 
variety of organisms including bacteria, yeast, insects, amphib- 
ians, mammals, and plants. The prototypic DEAD box protein 
is the translation initiation factor, eukaryotic initiation factor 
4A, which, when combined with eukaryotic initiation factor 4B, 
unwinds double-stranded RNA (4). Other DEAD box proteins, 
such as p68, Vasa, and An3, can effectively and independently
destabilize/unwind short RNA duplexes in vitro (5-7). Al- 
though some DEAD box proteins play general roles in cellular 
processes such as translation initiation (eukaryotic initiation 
factor 4A (4)), RNA splicing (PRP5, PRP28, and SPP81 in yeast 
(8-10)), and ribosomal assembly (SrmB in Escherichia coli 
(11)), the function of most DEAD box proteins remains un- 
known. Many of the DEAD box proteins found in higher eu- 
karyotes are tissue- or stage-specific. For example, PL10 
mRNA is expressed only in the male germ line, and its product
has been proposed to have a specific role in translational reg-
ulation during spermatogenesis (12). Vasa and ME31B are
maternal proteins that may be involved in embryogenesis (13, 
14). p68, found in dividing cells (15), is believed to be required 
for the formation of nucleoli and may also have a function in 
the regulation of cell growth and division (16, 17). Other DEAD 
box proteins are implicated in RNA degradation, mRNA stabil- 
ity, and RNA editing (18-20). 

The human DEAD box protein gene DDX1¹ was identified by
differential screening of a cDNA library enriched in transcripts
present in the two RB cell lines Y79 and RB522A (21). The
longest DDX1 cDNA insert isolated from this library was 2.4 kb
with an open reading frame from position 1 to 2201. All eight
conserved motifs characteristic of DEAD box proteins are found
in the predicted amino acid sequence of DDX1 as well as a
region with homology to the heterogeneous nuclear ribonucle-
oprotein U, a protein believed to participate in the processing of
heterogeneous nuclear RNA to mRNA (22, 23). The region of
homology to heterogeneous nuclear ribonucleoprotein U spans
128 amino acids and is located between the first two conserved
DEAD box protein motifs, Ia and Ib.

The proto-oncogene MYCN encodes a member of the MYC 
family of transcription factors that bind to an E box element 
(CACGTG) when dimerized with the MAX protein (24, 25). The 
MYCN gene is amplified and overexpressed in approximately 
one-third of all NB tumors (26, 27). Amplification of MYCN is 
associated with rapid tumor progression and a poor clinical 
prognosis (26, 27). MYCN overexpression is usually achieved 
by increasing gene copy number rather than by up-regulating 
basal expression of MYCN (27, 28). Because gene amplification 
involves hundreds to thousands of kilobase pairs of contiguous 
DNA (29-32), it is possible that co-amplification of a gene 
located in proximity to MYCN may contribute to the poor 
clinical prognosis of MYCN-amplified tumors. The DDX1 gene
maps to the same chromosomal band as MYCN, 2p24, and is
located ~400 kb telomeric to the MYCN gene (33-36). All four
MYCN-amplified RB tumor cell lines tested to date are ampli-
fied for DDX1 (21),² while approximately two-thirds of NB cell
lines and 38-68% of NB tumors are co-amplified for both genes
(37-39). George et al. (39) found a significant decrease in the
mean disease-free survival of patients with DDX1/MYCN-am-
plified NB tumors compared with MYCN-amplified tumors.
Similarly, Squire et al. (38) observed a trend toward a worse
clinical prognosis when both genes were amplified in the tu-
mors of NB patients. To date, there have been no reports of a
tumor amplified only for DDX1, and the role that this gene
plays in cancer formation and progression is not known.

¹ The abbreviations used are: DDX1, DEAD box 1; NB, neuroblas-
toma; RB, retinoblastoma; RACE, rapid amplification of cDNA ends;
PAGE, polyacrylamide gel electrophoresis; nt, nucleotide(s); MOPS,
4-morpholinepropanesulfonic acid; bp, base pair(s); kb, kilobase(s) or
kilobase pair(s).

² R. Godbout, unpublished results.

Because of the high rate of rearrangements in amplified
DNA (31, 40), it is unlikely that a gene located ~400 kb from
the MYCN gene will be consistently amplified as an intact unit
unless its product provides a growth advantage to the cell.
Based on Southern blot analysis, the DDX1 gene extends over
more than 30 kb, and there are no gross rearrangements of this
gene in DDX1-amplified tumors (21, 38). Furthermore, there is
a good correlation between DDX1 transcript levels and gene
copy number in the tumors analyzed to date. However, we need
to show that DDX1 protein is overexpressed in DDX1-amplified
tumors if we are to entertain the possibility that this protein
plays a role in the tumorigenic process. Here, we isolate and
characterize the 5'-end of DDX1 mRNA and extend the DDX1
cDNA sequence by ~300 nt. We identify the predicted initia-
tion codon of DDX1 and generate antisera that specifically
recognize DDX1 protein. We analyze levels of DDX1 protein in
both DDX1-amplified and nonamplified RB and NB tumors and
study the subcellular location of this protein in the cell.

MATERIALS AND METHODS 

Library Screening — A human fetal brain cDNA library (Stratagene)
was screened using a 320-bp DNA fragment from the 5'-end of the
2.4-kb DDX1 cDNA previously described (23). Phagemids containing
positive inserts were excised from λ ZAP II following the supplier's
directions. The ends of the cDNA inserts were sequenced using the
dideoxynucleotide chain termination method with T7 DNA polymerase
(Amersham Pharmacia Biotech).

A human placenta genomic library (CLONTECH) was screened with 
the 5'-end of DDX1 cDNA. Positive plaques were purified, and the 
genomic DNA was analyzed using restriction enzymes and Southern 
blotting. EcoRI-digested DNA fragments from these clones were sub-
cloned into pBluescript and digested with exonuclease III and mung
bean nuclease to obtain sequentially deleted clones. The exon/intron 
map of the 5' portion of the DDX1 gene was obtained by comparing the 
sequence of DDX1 cDNA with that of the genomic DNA. 

Rapid Amplification of cDNA Ends (RACE) — We used the Ampli-
FINDER RACE kit (CLONTECH) to extend the 5'-end of DDX1 cDNA.
Briefly, two µg of poly(A)+ RNA isolated from RB522A was reverse
transcribed at 52 °C using either primer P1 or P3 (Fig. 1A). The RNA
template was hydrolyzed, and excess primer was removed. A single-
stranded AmpliFINDER anchor containing an EcoRI site was ligated to
the 3'-end of the cDNA using T4 RNA ligase. The cDNA was amplified
using either primer P2 or P4 (Fig. 1A) and AmpliFINDER anchor
primer. RACE products were cloned into pBluescript.

Primer Extension — Poly(A)+ RNAs were isolated from RB and NB
cell lines as described previously (21, 38). The 21-nt primers
5'-TTCGTTCTGGGCACCATGTGT-3' (primer P4 in Fig. 1A) and
5'-TGGGACCTAGGGCTTCTGGAC-3' (primer P3 in Fig. 1A) were end-labeled with
[γ-32P]ATP (3000 Ci/mmol; Mandel Scientific) and T4 polynucleotide
kinase. Each of the labeled primers was annealed to 2 µg of poly(A)+
RNA at 45 °C for 90 min, and the cDNA was extended at 42 °C for 60
min using avian myeloblastosis virus reverse transcriptase (Promega).
The primer extension products were heat-denatured and run on an 8%
polyacrylamide gel containing 7 M urea in 1X TBE buffer. A G + A
sequencing ladder served as the size standard.

S1 Nuclease Protection Assay — The S1 nuclease protection assay to
map the transcription initiation site of DDX1 was performed as de-
scribed by Favaloro et al. (41). The DNA probe was prepared by digest-
ing genomic DNA spanning the upstream region of DDX1 and exon 1
with AvaI, labeling the ends with [γ-32P]ATP (3000 Ci/mmol) and
polynucleotide kinase, and removing the label from one of the ends by
digesting the DNA with SphI (Fig. 4). The RNA samples were resus-
pended in a hybridization mixture containing 80% formamide, 40 mM
PIPES, 400 mM NaCl, 1 mM EDTA, and the heat-denatured SphI-AvaI
probe labeled at the AvaI site. The samples were incubated at 45 °C for
16 h and digested with 3000 units/ml S1 nuclease (Boehringer Mann-
heim) for 60 min at 37 °C. The samples were precipitated with ethanol;
resuspended in 80% formaldehyde, TBE buffer, 0.1% bromphenol blue,
xylene cyanol; denatured at 90 °C for 2 min; and electrophoresed in a 7
M urea, 8% polyacrylamide gel in TBE buffer.

Northern and Southern Blot Analysis — Poly(A)+ RNAs were isolated
from RB and NB cell lines as described previously (21, 38). Two µg of
poly(A)+ RNA/lane were electrophoresed in a 6% formaldehyde, 1.5%
agarose gel in MOPS buffer (20 mM MOPS, 5 mM sodium acetate, 1 mM
EDTA, pH 7.0) and transferred to nitrocellulose filter in 3 M sodium
chloride, 0.3 M sodium citrate. The filters were hybridized to the follow-
ing DNA probes, 32P-labeled by nick translation: (i) a 1.6-kb EcoRI
insert from DDX1 cDNA clone 1042 (21), (ii) a 260-bp cDNA fragment
spanning the 3'-end of DDX1 exon 1 as well as exons 2 and 3, (iii) a
160-bp fragment derived from the 5'-end of DDX1 exon 1, and (iv)
α-actin cDNA to control for lane to lane variation in RNA levels. Filters
were hybridized and washed under high stringency. Southern blot
analysis was as described previously (21).

Preparation of Anti-DDX1 Antiserum — To prepare antiserum to the
C terminus of the DDX1 protein, we inserted a 1.8-kb EcoRI fragment
from bp 848 to 2668 of DDX1 cDNA (Fig. 1B) into EcoRI-digested
pMAL-c2 expression vector (New England Biolabs). DH5α cells trans-
formed with this vector were grown to mid-log phase and induced with
0.1 mM isopropyl-1-thio-β-D-thiogalactoside. The cells were harvested
3-4 h postinduction and lysed by sonication. Soluble maltose binding
protein-DDX1 fusion protein was affinity-purified using amylose resin,
and the maltose-binding protein was cleaved with factor Xa. The DDX1
protein was purified on a SDS-PAGE gel, electroeluted, and concen-
trated. Approximately 100 µg of protein was injected into rabbits at
4-6-week intervals. For the initial injection, the protein was dispersed
in complete Freund's adjuvant (Sigma), while subsequent injections
were prepared in Freund's incomplete adjuvant. Blood was collected
from each rabbit 10 days after injection, and the specificity of the
antiserum was tested using cell extracts from RB522A. To prepare
antiserum to the N terminus of DDX1 protein, a DDX1 cDNA fragment
from bp 268 to 851 (Fig. 1B) was inserted into pGEX-4T2 (Amersham
Pharmacia Biotech). The recombinant protein produced from this con-
struct contains the first 186 amino acids of the predicted DDX1 se-
quence. Soluble glutathione S-transferase-DDX1 fusion protein was
purified with glutathione-Sepharose 4B (Amersham Pharmacia Bio-
tech). The glutathione S-transferase component of the fusion protein
was cleaved with thrombin.

Subcellular Fractionations and Western Blot Analysis — We used two 
different procedures for subcellular fractionations. First, we isolated 
nuclear and S100 (soluble cytoplasmic) fractions from RB522A, IMR-32, 
Y79, RB(E)-2, HeLa, and HL60 using the procedure of Dignam (42). On 
average, we obtained 5-6 times more protein in the cytosolic fractions 
than in the nuclear fractions. Second, 10⁸ RB522A cells were lysed and
fractionated into S4 (soluble cytoplasmic components), P2 (heavy mito-
chondria, plasma membrane fragments), P3 (mitochondria, lysosomes,
peroxisomes, and Golgi membranes), and P4 fractions (membrane ves- 
icles from rough and smooth endoplasmic reticulum, Golgi, and plasma 
membrane) by differential centrifugation (43). We obtained 8 mg of 
protein in the S4 fraction, 1 mg in P2, 0.5 mg in P3, and 2 mg in P4 
fraction. The procedures related to the immunoelectron microscopy 
have been previously described (44). 

For Western blot analysis, proteins were electrophoresed in poly- 
acrylamide-SDS gels and electroblotted onto nitrocellulose using the 
standard protocol for protein transfer described by Schleicher and 
Schuell. The filters were incubated with a 1:5000 dilution of DDX1
antiserum, a 1:200 dilution of anti-MYCN monoclonal antibody (Boeh-
ringer Mannheim), or a 1:200 dilution of anti-actin (Santa Cruz Bio-
technology, Inc., Santa Cruz, CA). For the colorimetric analysis, anti-
gen-antibody interactions were visualized using either alkaline
phosphatase-linked goat anti-rabbit IgG (for DDX1) or goat anti-mouse
IgG (for MYCN) at a 1:3000 dilution. For the ECL Western blotting
analysis (Amersham Pharmacia Biotech), we used a 1:100,000 dilution
of peroxidase-linked secondary anti-rabbit IgG antibody (for DDX1) or
secondary anti-goat IgG antibody (Jackson ImmunoResearch 
Laboratories). 

RESULTS 

Identification of the 5'-End of the DDX1 Transcript — We
have previously reported the sequence of DDX1 cDNA isolated
from an RB cDNA library (21, 23). This 2.4-kb DDX1 cDNA
contains an open reading frame spanning positions 1-2201
with a methionine encoded by the first three nucleotides (Fig.
1A). There is a polyadenylation signal and poly(A) tail in the
3'-untranslated region, indicating that the sequence is com-
plete at the 3'-end. Manohar et al. (37) have also isolated DDX1
cDNA from the NB cell line LA-N-5. Their cDNA extended the
5'-end of our sequence by 42 bp and included an additional in
frame methionine (double underlined in Fig. 1A). The possibil-
ity of additional in frame methionines located further upstream
could not be excluded, because there were no predicted stop
codons in the upstream region of the cDNA.

FIG. 1. Partial sequence and structure of DDX1 cDNA. A, the
sequence of the 5'-end of DDX1 cDNA. The sequence in boldface type
starting at the asterisk was obtained using the RACE strategy. The
additional 6 bp in italic boldface type at the 5'-end of the cDNA are
predicted based on the known DDX1 genomic sequence and primer
extension analysis. P1, P2, P3, and P4 are primers used in the RACE
experiments (the complementary sequence was used in each case).
Primers P3 and P4 were also used for the primer extension analysis.
Three in frame methionine codons are indicated by the double under-
line. An in frame stop codon is indicated by the boldface double under-
line. The three major transcription initiation sites identified by primer
extension are indicated by the single arrows, while a minor site is
represented by the broad arrow. The predicted DDX1 transcription
initiation sites obtained by RACE, S1 nuclease, and primer extension
are indicated as well as the 5'-ends of DDX1 cDNA sequences obtained
by screening cDNA libraries. The sequences transcribed from exons 1,
2, and 3 are also shown. B, the structure of the 2711-bp DDX1 cDNA is
shown with an open reading frame from position 295 to 2515.

Northern blot analysis indicated a DDX1 transcript size of
~2800 nt, suggesting that the DDX1 cDNAs isolated to date
were lacking ~300-350 bp of 5' sequence. We have used dif-
ferent approaches to identify the transcription start site of
DDX1. First, we exhaustively screened a commercial fetal
brain cDNA library with the 5'-end of DDX1 cDNA. Although
numerous clones were analyzed, only one extended the se-
quence (by 35 bp) beyond that published by Manohar et al. (37)
(Fig. 1A).

We next used the RACE procedure in an attempt to isolate
additional 5' sequence. The nested primers used to amplify the
5'-end of the DDX1 transcript are labeled as primers P1 and P2
in Fig. 1A and are located downstream of the three in frame
methionines (double underlined in Fig. 1A). Poly(A)+ RNA
from RB522A was reverse transcribed at 52 °C using primer
P1, and the reverse transcribed cDNA was amplified using the
nested primer P2 and the 5'-RACE primer. Using this ap-
proach, we generated a product that was 230 bp longer than
any of the cDNAs obtained by screening libraries (Fig. 1A).
Sequencing of this 230-bp cDNA revealed an in frame stop
codon (boldface double underline in Fig. 1A) located 123 bp
upstream of the predicted translation initiation site. We then
prepared primers P3 and P4, located near the 5'-end of the
RACE cDNA (Fig. 1A) and repeated the RACE procedure to see
if additional 5' sequences could be obtained. The resulting
RACE products did not extend the DDX1 cDNA sequence
further.

FIG. 2. Identification of the 5'-end of the DDX1 transcript by
primer extension. Radioactively labeled primer P4 was annealed to 2
µg of poly(A)+ RNA from RB522A (lane 1), 1 µg of poly(A)+ RNA from
RB522A (lane 2), and 2 µg of poly(A)+ RNA from RB(E)-2 cells (lane 3)
and extended using reverse transcriptase. The products were run on an
8% denaturing polyacrylamide gel with a G + A sequencing ladder as
size marker. The primer extension products are indicated on the left.
The sizes of the products (in nt) are presented as the distance from
primer P4.

The location of the DDX1 transcription initiation site was
verified by primer extension. Poly(A)+ RNA was prepared from
the following two cell lines: DDX1-amplified RB cell line
RB522A and a nonamplified RB cell line RB(E)-2. RB522A has
elevated levels of DDX1 mRNA, while RB(E)-2 has at least
20-fold lower levels of this transcript. Three products of 40, 43,
and 46 nt (with a weak signal at 45 nt) were detected in
RB522A using primer P4 (Figs. 1A and 2). The 40-nt product
corresponded exactly with the 5'-end of the RACE-derived
cDNA while the 43- and 46-nt products extended the predicted
size of the DDX1 transcript by 3 and 6 nt, respectively. None of
these products were observed in RB(E)-2. Bands of identical
sizes to those obtained with RB522A mRNA were also observed
in the DDX1-amplified NB cell line BE(2)-C but not in the
DDX1-amplified NB cell line IMR-32 (data not shown). The
same predicted DDX1 transcription initiation site was identi-
fied with primer P3 except that the bands were of weaker
intensity (data not shown). We have designated the transcrip-
tion start site identified by primer extension as +1 (Fig. 1A).

The sequence of the 6 nt extending beyond the RACE cDNA
was obtained by comparison of the cDNA sequence with that of
DDX1 genomic DNA. Bacteriophages containing DDX1
genomic DNA were isolated by screening a human placenta
library with 5' DDX1 cDNA. Eighteen kb of DNA were se-
quenced from two bacteriophages with overlapping DDX1
genomic DNA. Thirteen exons were identified within this 18-kb
region (Fig. 3) corresponding to cDNA sequences from position
1 to 1249. The 310-bp exon 1 was by far the longest of the 13
exons sequenced, corresponding to the entire 5'-untranslated
region of DDX1 as well as the first in frame methionine. The
sequences transcribed from exons 1, 2, and 3 are indicated in
Fig. 1A.

Knowledge of the genomic structure of DDX1 allowed us to
use the S1 protection assay, a technique that is independent of
reverse transcriptase, to further define the 5'-end of the DDX1
transcript. Poly(A)+ RNAs from six DDX1-amplified lines (RB
lines: Y79 and RB522A; NB lines: BE(2)-C, IMR-32, LA-N-1,
and LA-N-5) and six nonamplified lines (RB lines: RB(E)-2 and
RB412; NB lines: GOTO, NB-1, NUB-7, and SK-N-MC) were
hybridized to a DNA probe that extended from position -745 in
the 5'-flanking DDX1 DNA to position +164 in exon 1. This
DNA probe was labeled at position +164 as indicated in Fig. 4.
Nonhybridized DNA was digested with S1 nuclease, and the
sizes of the protected fragments were analyzed on a denaturing
polyacrylamide gel. Bands of 150-153 nt were observed in lane
2 (RB522A), lane 5 (BE(2)-C), and lane 8 (LA-N-1) with bands
of much weaker intensity in lane 7 (IMR-32) (Fig. 4). Specific
bands were not detected in either DDX1-amplified Y79 and
LA-N-5 or the nonamplified lines. Although the sizes of the S1
protected bands in RB522A, BE(2)-C, and LA-N-1 were 5 and
11 nt shorter than predicted based on RACE and primer ex-
tensions, respectively, there was general agreement with all
three techniques regarding the location of the DDX1 transcrip-
tion initiation site (Fig. 1A). The smaller S1 nuclease protected
products could have arisen as the result of S1 digestion of the
5'-end of the RNA:DNA heteroduplex because of its relatively
high rU:dA content (45).
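
The size comparison in the preceding paragraph can be checked with simple coordinate arithmetic. The Python sketch below is illustrative only and is not part of the original work: the helper name `protected_length` is hypothetical, the RACE-derived 5'-end is assumed to lie at +7 (6 nt downstream of the +1 site defined by primer extension), there is assumed to be no position 0, and the shortfall is assumed to be measured against the longest observed band (153 nt).

```python
def protected_length(start_site: int, label_site: int) -> int:
    """Length (nt) of labeled probe protected by a transcript that starts at
    start_site, counted up to and including label_site (hypothetical helper)."""
    if start_site < 1 or label_site < start_site:
        raise ValueError("expected 1 <= start_site <= label_site")
    return label_site - start_site + 1


LABEL_SITE = 164             # probe labeled at +164 in exon 1 (from the text)
PRIMER_EXTENSION_START = 1   # +1 as designated by primer extension (from the text)
RACE_START = 7               # assumption: RACE cDNA ends 6 nt downstream of +1
LONGEST_OBSERVED_BAND = 153  # upper end of the 150-153 nt bands (from the text)

predicted_primer_extension = protected_length(PRIMER_EXTENSION_START, LABEL_SITE)
predicted_race = protected_length(RACE_START, LABEL_SITE)

shortfall_primer_extension = predicted_primer_extension - LONGEST_OBSERVED_BAND
shortfall_race = predicted_race - LONGEST_OBSERVED_BAND

print("predicted from primer extension:", predicted_primer_extension, "nt")
print("predicted from RACE:", predicted_race, "nt")
print("shortfall vs. primer extension:", shortfall_primer_extension, "nt")
print("shortfall vs. RACE:", shortfall_race, "nt")
```

Under these assumptions the two shortfalls come out as 11 nt and 5 nt, which matches the figures quoted in the text.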

Identification of the same transcription initiation site in
three DDX1-amplified lines suggests that this represents the
bona fide start site of DDX1 transcription. However, it was not
clear why this start site was either very weak or not detected in
three other amplified lines. To determine whether the 5'-end of
exon 1 is transcribed in all DDX1-amplified lines, we carried
out a direct analysis of the 5'-end of the DDX1 transcript by
Northern blotting. Two probes were used for this analysis: the
5' probe contained a 160-bp fragment from bp 1 to 160 (5'-half
of exon 1), and the 3' probe contained a 260-bp fragment from
bp 160 to 420 (3'-half of exon 1 as well as exons 2 and 3) (Fig. 1A).
With the 3' probe, we obtained bands of similar size and inten-
sity in four DDX1-amplified lines (RB522A, BE(2)-C, IMR-32,
and LA-N-5). Band intensity was somewhat weaker in Y79 and
stronger in LA-N-1 in comparison with the other lines (Fig. 5).
No signal was detected in the non-DDX1-amplified line RB412.
With the 5' probe, a relatively strong signal was observed in
RB522A, BE(2)-C, and LA-N-1, while a considerably weaker
but readily apparent signal was detected in Y79, IMR-32, and
LA-N-5. The signal obtained with actin indicates that, with the
exception of LA-N-1, similar amounts of RNA were loaded in
each lane and that the RNA was not degraded. These results
indicate that at least a portion of the 160-bp 5'-end of exon 1 is
transcribed in all DDX1-amplified lines.

Based on primer extension, S1 nuclease protection assay, 
Northern blot analysis and the sequencing of the RACE products, 
we conclude that the DDX1 transcript is 2.7 kb with an 
open reading frame spanning nucleotides 295-2515 encoding a 
predicted protein of 740 amino acids with an estimated molecular 
weight of 82.4 kDa (Fig. 1B). An in frame stop codon is located 
123 nt upstream of the predicted translation initiation site, at 
positions 172-174. The first in frame methionine following the 
stop codon is in agreement with the Kozak consensus sequence 
(46). Furthermore, the predicted start methionine codon for 
human DDX1 corresponds perfectly with that of Drosophila 
DDX1 (47). A stop codon is located 15 nt upstream of the 
initiation codon in Drosophila DDX1. 
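
The coordinate arithmetic in the preceding paragraph can be checked with a short sketch. The positions and the protein length are taken from the text; the average residue mass of about 110 Da is a rule-of-thumb assumption introduced here for illustration and is not the authors' method for deriving the 82.4-kDa estimate.

```python
# Coordinates as reported in the text (1-based cDNA nucleotide positions).
ORF_START = 295        # first nt of the predicted initiation codon
ORF_END = 2515         # last ORF position reported in the text
UPSTREAM_STOP = 172    # first nt of the upstream stop codon (positions 172-174)
PROTEIN_LENGTH = 740   # predicted number of amino acids

# Assumption (not from the source): rule-of-thumb average residue mass in Da.
AVG_RESIDUE_MASS_DA = 110


def in_frame(pos_a, pos_b):
    """Two codon start positions share a reading frame if they differ by a multiple of 3."""
    return (pos_b - pos_a) % 3 == 0


stop_offset = ORF_START - UPSTREAM_STOP            # distance from stop codon to ATG
coding_nt = PROTEIN_LENGTH * 3                     # nt needed to encode the protein
reported_span = ORF_END - ORF_START + 1            # nt covered by the reported range
approx_mass_kda = PROTEIN_LENGTH * AVG_RESIDUE_MASS_DA / 1000

print(f"upstream stop is {stop_offset} nt upstream, in frame: {in_frame(UPSTREAM_STOP, ORF_START)}")
print(f"{PROTEIN_LENGTH} codons need {coding_nt} nt; reported span covers {reported_span} nt")
print(f"rough mass estimate: {approx_mass_kda:.1f} kDa")
```

Under these assumptions the upstream stop codon falls in the same frame as the predicted start codon, and the rough mass estimate lands close to the 82.4 kDa stated in the text.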

Analysis of DDX1 Protein Levels in Neuroblastoma and Retinoblastoma. 
We and others have previously shown that there 
is a good correlation between gene copy number and RNA levels 
in DDX1-amplified RB and NB cell lines (37, 38). To determine 
whether the correlation extends to DDX1 protein levels, we 
prepared antiserum to two nonoverlapping recombinant DDX1 
proteins. First, we prepared a C terminus recombinant protein 
construct by inserting a 1.8-kb EcoRI fragment from bp 848 to 
2668 (amino acids 185-740) (Fig. 1B) into the pMAL-c2 expression 
vector. Recombinant protein expression was induced with 
isopropyl-1-thio-β-D-thiogalactoside, and the 110-kDa maltose-binding 
protein-DDX1 fusion product was purified by affinity 
chromatography using amylose resin, followed by electrophoresis 
on a SDS-PAGE gel after cleaving the maltose-binding 
protein fusion partner with factor Xa. Second, we prepared an 
N terminus construct by ligating a DNA fragment from bp 268 
to 851 (amino acids 1-186) into pGEX-4T2. The 50-kDa glutathione 
S-transferase-DDX1 fusion protein was purified by affinity 
chromatography on a glutathione column. This N terminus 
fusion protein contains only the first of the eight conserved 
motifs found in all DEAD box proteins, while the C terminus 
fusion protein includes the remaining seven motifs. 

We measured DDX1 protein levels in total cell extracts of 
three RB and 10 NB cell lines. Using antiserum to the N 
terminus fusion protein, we observed a strong signal in all 
DDX1-amplified cell lines: the RB cell lines Y79 (lane 1) and 
RB522A (lane 2) and the NB cell lines BE(2)-C (lane 4), IMR-32 
(lane 6), LA-N-1 (lane 8), and LA-N-5 (lane 9) (Fig. 6). Two 
bands were observed in the majority of extracts. Of the amplified 
lines, Y79 produced the weakest signal, with the most 
intense signal observed in LA-N-1. There was an excellent 
correlation between DDX1 protein and mRNA levels in these cell 
lines, with lower levels of DDX1 mRNA observed in Y79 and 
higher levels in LA-N-1 (Fig. 7A). As shown in Fig. 7B, this 
correlation extended to DDX1 gene copy number. No gross 
DNA rearrangements were seen in the DDX1-amplified lines; 
however, three small bands of altered size were observed in the 
RB412 lane. Although the nature of the DNA alteration is not 
known, it is noteworthy that DDX1 transcript levels in RB412 
Fig. 4. S1 nuclease mapping of the 
5'-end of the DDX1 transcript. Two µg 
of poly(A)+ RNA from four RB lines 
(DDX1-amplified Y79 and RB522A and 
nonamplified RB(E)-2 and RB412), eight 
NB lines (DDX1-amplified BE(2)-C, IMR-32, 
LA-N-1, and LA-N-5 and nonamplified 
GOTO, NB-1, NUB-7, and SK-N-MC), 
and tRNA as a negative control were hybridized 
to a SphI-AvaI fragment labeled 
at the AvaI site with [γ-32P]ATP and 
polynucleotide kinase. Bands of 150-153 
nt are shown in lanes 2 (RB522A), 5 
(BE(2)-C), and 8 (LA-N-1) with much 
weaker bands in lane 7 (IMR-32). A map 
of the probe indicating the transcription 
initiation site identified by primer extension 
(+2), the labeling site (*), and exons 
1 and 2, is shown at the bottom. 
are extremely low (Fig. 7A) and that the top DDX1 protein band 
in RB412 cell extracts is smaller in size than the top band from 
the other cell extracts (Fig. 6). 

Two DDX1 protein bands were present in most of the lanes in 
Fig. 6. The same two bands were detected with antiserum to 
the C terminus of the DDX1 protein, as well as a third band at 
~60 kDa (data not shown). There was no variation in the 
intensity of the 60-kDa band in DDX1-amplified and nonamplified 
cell extracts. The 60-kDa band probably represents another 
member of the DEAD box protein family, because the C 
terminus DDX1 protein used to prepare this antiserum contained 
seven of the eight conserved motifs found in all DEAD 
box proteins. To obtain an estimate of the size of the two DDX1 
bands, we ran cellular extracts from RB522A on a 7% SDS-PAGE 
gel with the BenchMark protein ladder (Life Technologies, 
Inc.). The size of the DDX1 protein was determined using 
the Alpha Imager 2000 documentation and analysis system for 
molecular weight calculation. Based on this analysis, the estimated 
molecular mass of the top band is 89.5 kDa, while that 
of the bottom band is 83.5 kDa. The 84-kDa band may represent 
the unmodified product encoded by the DDX1 transcript 
(capable of encoding a protein with a predicted molecular mass 
of 82.4 kDa), while the top band may represent post-translational 
modification of DDX1 protein (e.g. phosphorylation). Another 
possibility is that the top band represents intact DDX1 




Fig. 5. Northern blot analysis of the 5'-end of the DDX1 transcript. 
Two µg of poly(A)+ RNA isolated from DDX1-amplified Y79, 
RB522A, BE(2)-C, IMR-32, LA-N-1, and LA-N-5 and nonamplified 
RB412 were electrophoresed in a 1.5% agarose-formaldehyde gel. The 
RNA was transferred to a nitrocellulose filter and sequentially hybridized 
with a 260-bp fragment from DDX1 cDNA from bp +160 to +420 
(3'-end of exon 1 as well as exons 2 and 3) (A), a 160-bp fragment from 
DDX1 cDNA from bp +1 to +160 (5'-end of exon 1) (B), and actin cDNA 
(C). The DNA was labeled with [32P]dCTP by nick translation. The blots 
were hybridized and washed under high stringency. 
Fig. 6. DDX1 protein expression in RB and NB cell lines. Western 
blots were prepared using total cellular extracts from three RB 
(Y79, RB522A, and RB412) and 10 NB cell lines (BE(2)-C, GOTO, 
IMR-32, KAN, LA-N-1, LA-N-5, NB-1, NUB-7, SK-N-MC, and SK-N-SH). 
The lines that are amplified for the DDX1 gene are Y79, RB522A, 
BE(2)-C, IMR-32, LA-N-1, and LA-N-5. Twenty µg of protein were 
loaded in each lane and electrophoresed in a 10% SDS-PAGE gel. DDX1 
was detected using a 1:5000 dilution of the antiserum to the amino 
terminus of DDX1 protein. Size markers in kilodaltons are indicated on 
the side. 
Fig. 8. Distribution of DDX1 in the nucleus and cytoplasm. A, 
cytosolic and nuclear extracts were prepared from RB522A and electrophoresed 
in a 7% SDS-PAGE gel. Cytosolic extracts were loaded in 
lanes 1 (20 µg of protein) and 2 (10 µg), while nuclear extracts were 
loaded in lanes 3 (10 µg) and 4 (20 µg). DDX1 was visualized using a 
1:5000 dilution of the antiserum to the N terminus. The BenchMark 
protein ladder size markers (kilodaltons) are indicated on the left. B, 
cytosolic and nuclear extracts were prepared from HL60, Y79, IMR-32, 
HeLa, RB522A, and RB(E)-2 and electrophoresed in an 8% SDS-PAGE 
gel. Twenty µg of proteins were loaded in each lane marked C (cytosolic) 
and N (nuclear). DDX1 was visualized using a 1:5000 dilution of the 
antiserum to the N terminus. Actin levels were analyzed using a 1:200 
dilution of anti-actin antibody (Santa Cruz Biotechnology). 




Fig. 7. Northern and Southern blot analyses of DDX1 in RB 
and NB cell lines. A, 2 µg of poly(A)+ RNA were loaded in each lane, 
electrophoresed in a 1.5% agarose-formaldehyde gel, and transferred to 
a nitrocellulose filter. The filter was first hybridized to a 32P-labeled 
1.6-kb DDX1 cDNA (clone 1042) (21), stripped, and rehybridized to 
actin DNA. B, 10 µg of genomic DNA from each of the indicated cell 
lines were digested with EcoRI, electrophoresed in a 1% agarose gel, 
and transferred to a nitrocellulose filter. The filter was hybridized to 
32P-labeled clone 1042 DDX1 cDNA, stripped, and reprobed with labeled 
α-fetoprotein cDNA. Markers (in kilobase pairs) are indicated on 
the side. 

and the lower band is a specific truncated or degradation product 
of DDX1. Yet a third possibility is that the two bands 
represent the products of differentially spliced transcripts or 
different translation initiation sites. However, the lack of any 
obvious differences in DDX1 transcript sizes in the three RB 
and 10 NB lines analyzed in Fig. 7A does not support the latter 
possibility (e.g. compare the DDX1 transcript size in NUB-7 
(which produces the lower DDX1 protein band) and in NB-1 
(which produces the higher DDX1 protein band)). 
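
The band-size estimate described above (89.5 and 83.5 kDa against a protein ladder) is commonly obtained from a standard curve in which log10 of the marker mass is roughly linear in relative mobility. The sketch below illustrates that general approach only; it is not a description of what the Alpha Imager 2000 software does, and the ladder mobilities in `HYPOTHETICAL_LADDER` are invented for illustration, not measurements from the paper.

```python
from math import log10


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def estimate_mass_kda(ladder, band_mobility):
    """Estimate a band's mass from (mass_kda, relative_mobility) ladder pairs."""
    mobilities = [rf for _, rf in ladder]
    log_masses = [log10(mass) for mass, _ in ladder]
    slope, intercept = fit_line(mobilities, log_masses)
    return 10 ** (slope * band_mobility + intercept)


# Hypothetical ladder: (mass in kDa, relative mobility). Illustrative values only.
HYPOTHETICAL_LADDER = [
    (120, 0.20), (100, 0.28), (90, 0.33), (80, 0.38), (70, 0.45), (60, 0.52),
]

for mobility in (0.33, 0.36):  # hypothetical positions of an upper and a lower band
    print(f"Rf {mobility:.2f} -> {estimate_mass_kda(HYPOTHETICAL_LADDER, mobility):.1f} kDa")
```

Because two bands only a few kilodaltons apart sit on a shallow part of the curve, a low-percentage gel and a long run (as used in the text) help separate them enough for this kind of interpolation.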

Subcellular Localization of DDX1 Protein. DEAD box proteins 
have been implicated in a variety of cellular functions 
including RNA splicing in the nucleus, translation initiation in 
the cytoplasm, and ribosome assembly in the nucleolus. To 
obtain an indication of the possible role of DDX1, we studied its 
subcellular location. Nuclear and cytosolic extracts were prepared 
from DDX1-amplified RB522A and run on a 7% SDS-PAGE 
gel. Although there was more DDX1 protein in the 
cytosol than in the nucleus on a per cell basis, the proportion of 
DDX1 protein relative to total protein was similar in both 
cellular compartments (Fig. 8A). Both the 90- and 84-kDa 
bands were present in cytosol and nuclear extracts, although 
the bottom band was more readily apparent in the cytosol. By 
running the gel for an extended period of time (twice as long as 
usual), we were able to detect an additional weak band at ~88 
kDa in both nuclear and cytosolic extracts. 

To determine whether DDX1 consistently localizes to both 
the cytoplasm and nucleus, we prepared cytosol and nuclear 
extracts from two additional DDX1-amplified lines, Y79 and 
IMR-32, as well as from nonamplified RB(E)-2, HL60, and 
HeLa. DDX1 protein was found in both the nucleus and cytoplasm 
of IMR-32, primarily in the cytoplasm of Y79, and 
mainly in the nucleus of the three nonamplified lines (Fig. 8B). 
In addition, DDX1 was almost exclusively found in nuclear 
Fig. 9. Subcellular location of DDX1 protein. RB522A cells were 
fractionated into nuclear (lane 1), S100 and S4 cytosol (lanes 2 and 3), 
P2 membrane (lane 4), P3 membrane (lane 5), and P4 membrane (lane 
6) fractions. Twenty µg of protein were loaded in each lane and run on 
a 10% SDS-PAGE gel. A, DDX1 protein was detected using a 1:5000 
dilution of the antiserum to the N terminus of DDX1. B, MYCN protein 
was detected using a commercially available antibody at a 1:200 dilution. 
Size markers (kilodaltons) are indicated on the side. 

extracts prepared from normal GM38 fibroblasts (data not 
shown). We used anti-actin antibody to ensure that our nuclear 
and cytosolic extracts were not cross-contaminated (Fig. 8B). 

We next carried out a more detailed analysis of DDX1 subcellular 
location using two different approaches: (i) fractionation 
of cellular components into nuclei; S100 or S4 cytosol 
(containing soluble cytoplasmic components, including 40 S 
ribosomes); P2 (heavy mitochondria, plasma membrane fragments 
plus material trapped by these membranes); P3 (mitochondria, 
lysosomes, peroxisomes, Golgi membranes, some 
rough endoplasmic reticulum); and P4 (microsomes from 
smooth and rough endoplasmic reticulum, Golgi and plasma 
membranes) (43); and (ii) immunogold electron microscopy. 
The DDX1-amplified RB522A cell line was used for both experiments. 
The fractionation procedures indicate that DDX1 is 
mainly in the nucleus and in the cytosol (S4 and S100 fractions) 
of RB522A cells (Fig. 9A). As a control, we used anti-human 
MYCN antibody to determine the location of MYCN (also amplified 
in RB522A) in our subcellular fractions. As shown in 
Fig. 9B, MYCN was primarily found in the nucleus, as one 
would expect of a transcription factor. 

For the electron microscopy analysis, antiserum to the N 
terminus of DDX1 was coupled to protein A gold particles, and 
the distribution of DDX1 was examined in RB522A cells fixed 
in paraformaldehyde and glutaraldehyde. DDX1 was present 
in both the cytoplasm and nucleus (data not shown). There was 
no association with either cell organelles or with nuclear or 
plasma membranes. 

DISCUSSION 

There are presently few clues as to the function of DDX1 in 
normal and cancer cells. Our earlier data indicate that DDX1 
mRNA is present at higher levels in fetal tissues of neural 
origin (retina and brain) compared with other fetal tissues (21). 
There may therefore be a requirement for elevated levels of this 
putative RNA helicase for the efficient production or processing 
of neural specific transcripts. A role in cancer formation or 
progression is an intriguing possibility, because overexpression 
of an RNA unwinding protein could affect the secondary structure 
of RNAs in such a way as to alter the expression of specific 
proteins in tumor cells. DDX1 is co-amplified with MYCN in a 
subset of RB and NB cell lines and tumors (37-39). MYCN 
amplification is common in stage IV NB tumors and is a well 
documented indicator of poor prognosis. A general trend toward 
a poorer clinical prognosis is observed when both the 
MYCN and DDX1 genes are amplified compared with when 
only MYCN is amplified (38, 39), suggesting a possible role for 
DDX1 in NB tumor formation or progression. 

It is generally accepted that co-amplified genes are not over- 
expressed unless they provide a selective growth advantage to 
the cell (48, 49). For example, although ERBA is closely linked 
to ERBB2 in breast cancer and both genes are commonly am- 
plified in these tumors, ERBA is not overexpressed (48). Similarly, 
three genes mapping to 12q13-14 (CDK4, SAS, and 
MDM2) are overexpressed in a high percentage of malignant 
gliomas showing amplification of this chromosomal region, 
while other genes mapping to this region (GADD153, GLI, and 
A2MR) are rarely overexpressed in gene-amplified malignant 
gliomas (50, 51). The first three genes are probably the main 
targets of the amplification process, while the latter three 
genes are probably incidentally included in the amplicons. The 
data shown here indicate that DDX1 is overexpressed at both 
the protein and RNA levels in DDX1-amplified RB and NB cell 
lines and that there is a strong correlation between DDX1 gene 
copy number, DDX1 RNA levels, and DDX1 protein levels in 
these lines. Our results are therefore consistent with DDX1 
overexpression playing a positive role in some aspect of NB and 
RB tumor formation or progression. Recently, Weiss et al. (52) 
have shown that transgenic mice that overexpress MYCN develop 
NB tumors several months after birth. They conclude 
that MYCN overexpression can contribute to the initiation of 
tumorigenesis but that additional events are required for tumor 
formation. Amplification of DDX1 may represent one of 
many alternative pathways by which a normal precursor "neuroblast" 
or "retinoblast" cell gains malignant properties. 

The function of the majority of tissue-specific or developmentally 
regulated DEAD box genes remains unknown. However, 
some members of this protein family have been either directly 
or indirectly implicated in tumorigenesis. For example, the p68 
gene has been found to be mutated in the ultraviolet light-induced 
murine tumor 8101 (53), while DDX6 (also known as 
RCK or p54) is encoded by a gene located at the breakpoint of 
the translocation involving chromosomes 11 and 14 in a cell 
line derived from a B-cell lymphoma (54, 55). Similarly, the 
production of a chimeric protein between DDX10 and the 
nucleoporin gene NUP98 has been proposed to be involved in 
the pathogenesis of a subset of myeloid malignancies with 
inv(11)(p15q22) (56). Interestingly, Grandori et al. (57) have 
shown that MYCC interacts with a DEAD box gene called 
MrDb, suggesting that the transcription of some DEAD box 
genes could be regulated through interaction with members of 
the MYC family. Future work will involve determining whether 
DDX1 represents another member of the DEAD box family 
with a role in the tumorigenic process. 

DEAD box proteins have been implicated in translation initiation, 
RNA splicing, RNA degradation, and RNA stability (3, 
18, 19). We carried out subcellular localization studies in an 
attempt to obtain a general indication of the function of DDX1. 
We found DDX1 protein in both the cytoplasm and nucleus of 
DDX1-amplified NB and RB lines. In contrast, DDX1 was 
mainly located in the nucleus of nonamplified cell lines and 
normal fibroblast cultures. DDX1 was not associated with cellular 
organelles or with membranes based on immunoelectron 
microscopy. We therefore propose that the primary role of 
DDX1 is in the nucleus. The presence of DDX1 in the cytoplasm 
of DDX1-amplified cells may indicate that the amount of DDX1 
protein that is allowed in the nucleus is tightly regulated. 
Alternatively, DDX1 may play a dual role in the nucleus and 
cytoplasm of DDX1-amplified cells. 

An important component of our analysis was to identify the 
translation and transcription initiation sites of DDX1. We used 
a combination of techniques to identify the transcription start 
site: screening of RB and fetal brain libraries, RACE/primer 
extension, genomic DNA sequencing, S1 nuclease mapping, 
and Northern blot analysis using probes to the predicted 5'-end 
of the transcript. The transcription start site identified using 
these techniques is located ~300 nt upstream of the predicted 
translation initiation codon and was readily detected in three 
DDX1-amplified lines and barely detectable in a fourth amplified 
line. The 5'-untranslated region as well as the first in 
frame methionine are encoded within the first exon of DDX1. 
An in frame stop codon is located 123 nt upstream of the 
predicted initiation codon. We were unable to identify the transcription 
initiation site of DDX1 in two of the six amplified lines 
tested as well as in nonamplified lines. Although it remains 
possible that there are different transcription start sites in 
different cell lines, detection of lower levels (rather than the 
absence) of the 5'-most 160 nt of the DDX1 transcript in IMR-32, 
Y79, and LA-N-5 compared with RB522A, BE(2)-C, and 
LA-N-1 supports a quantitative rather than a qualitative difference 
in the 5'-end of this transcript in these cells. Our 
results suggest that the 5'-end of DDX1 mRNA is rarely intact, 
even in mRNA preparations that otherwise appear to be of high 
quality based on analysis of control transcripts. The 5'-end of 
DDX1 mRNA may therefore be especially susceptible to degradation, 
perhaps because of its sequence and/or secondary 
structure. 

In conclusion, we have mapped the 5'-end of the 2.7-kb DDX1 
transcript and have identified the predicted translation initiation 
site of DDX1 protein. We have found that DDX1-amplified 
RB and NB tumor lines overexpress DDX1 protein and that 
there is a good correlation between gene copy number and both 
transcript and protein levels in these cells. We have shown that 
DDX1 protein is primarily located in the nucleus of cells that 
are not DDX1-amplified. In contrast, DDX1 is present in both 
the nucleus and cytoplasm of DDX1-amplified NB and RB 
lines. A cytoplasmic location in DDX1-amplified lines may indicate 
that the amount of nuclear DDX1 is tightly regulated or 
that DDX1 plays a dual role in the cytoplasm and nucleus of 
these cells. 
Amplification and overexpression of putative oncogenes 
confer growth advantages for tumor development. We 
used a functional genomic approach that integrated 
simultaneous genomic and transcript microarray, proteomics, 
and tissue microarray analyses to directly identify 
putative oncogenes in lung adenocarcinoma. We first 
identified 183 genes with increases in both genomic copy 
number and transcript in six lung adenocarcinoma cell 
lines. Next, we used two-dimensional polyacrylamide gel 
electrophoresis and mass spectrometry to identify 42 
proteins that were overexpressed in the cancer cells 
relative to normal cells. Comparing the 183 genes with 
the 42 proteins, we identified four genes - PRDX1, 
EEF1A2, CALR, and KCIP-1 - in which elevated protein 
expression correlated with both increased DNA copy 
number and increased transcript levels (all r>0.84, two-sided 
P<0.05). These findings were validated by Southern, 
Northern, and Western blotting. Specific inhibition of 
EEF1A2 and KCIP-1 expression with siRNA in the four 
cell lines tested suppressed proliferation and induced 
apoptosis. Parallel fluorescence in situ hybridization and 
immunohistochemical analyses of EEF1A2 and KCIP-1 in 
tissue microarrays from patients with lung adenocarcinoma 
showed that gene amplification was associated with high 
protein expression for both genes and that protein 
overexpression was related to tumor grade, disease stage, 
Ki-67 expression, and a shorter survival of patients. The 
amplification of EEF1A2 and KCIP-1 and the presence of 
overexpressed protein in tumor samples strongly suggest 
that these genes could be oncogenes and hence potential 
targets for diagnosis and therapy in lung adenocarcinoma. 
Oncogene (2006) 25, 2628-2635. doi:10.1038/sj.onc.1209289; 
published online 12 December 2005 
Introduction 

In lung adenocarcinoma, as in other types of cancer, 
gene amplification and the consequent overexpression of 
the amplified oncogene play an important role in the 
development of tumors, because their overexpression 
confers a growth advantage. The ability to identify 
putative oncogenes that are activated during tumorigenesis 
could facilitate the choice of molecular genetic 
targets for diagnosis and therapy of the disease. This 
concept has been exemplified by HER-2, which was first 
found to be amplified in neuroblastomas and subsequently 
shown to be associated with poor prognosis in 
breast cancer (Ross and Fletcher, 1999). Now, HER-2 
aberrations are used as a predictor of response to 
therapy, and treatment of HER-2-positive breast cancer 
with the monoclonal anti-HER-2 antibody trastuzumab 
has been shown to improve prognosis (Ross and 
Fletcher, 1999). Emerging evidence of common amplicons 
in lung adenocarcinomas (Luk et al., 2001; Jiang 
et al., 2004; Tonon et al., 2005) suggests that additional 
oncogenes remain to be identified; however, conventional 
techniques are ineffective in pinpointing such 
oncogenes. Parallel measurement of DNA copy number 
and mRNA levels in cDNA microarrays permits 
changes in copy number to be compared with transcription 
levels on a gene-by-gene basis to generate lists of 
candidate genes within the defining amplicons (Hyman 
et al., 2002; Pollack et al., 2002). However, use of 
transcript patterns does not allow assessment of the 
expression of protein products or identification of proto-oncogenes. 
Another approach, identifying differentially 
expressed proteins by proteomic analysis and then 
comparing the proteins present with mRNA expression 
in cDNA microarrays from the same specimens, can 
clarify the extent to which changes in transcript patterns 
reflect changes in their cognate proteins and post-transcriptional 
mechanisms (Chen et al., 2002), but this 
approach cannot be used to identify oncogenes driven 
by extensive increases of their gene copy number. 
Moreover, using individual microarrays or proteomic 
approaches alone cannot distinguish the cancer-driving 
oncogenes that directly propel tumor progression from 
the larger number of passenger genes that may be 
concurrently over-represented but are not biologically 
relevant in tumor development. 



In this study, we used a comprehensive approach that 
integrated simultaneous comparative genomic hybridi- 
zation (CGH) and transcript microarray with proteomic 
analyses of six lung adenocarcinoma cell lines. We 
directly and specifically identified four putative oncogenes 
that could have been activated through amplification 
tion and consequent elevation of transcript expression. 
We used small interfering RNA (siRNA) to inhibit the 
expression of two of these four genes in the lung cancer 
cell lines, which further implicated them in oncogenesis. 
We then explored the clinical significance of these 
findings by assessing the expression of these two genes 
in tissue microarrays of human lung cancer specimens. 
Our findings underscore the power of integrated 
functional genomic analyses for identifying putative 
oncogenes in tumorigenesis; such activated genes could 
be useful as targets for diagnosis or therapy in lung 
cancer. 



Results 

Simultaneous global genomic and transcript analyses 
identify 183 genes with increases in genomic copy 
numbers and transcript expression levels 
To identify genes in which increased DNA copy number 
might contribute to increased transcript in lung adeno- 
carcinomas, first we used CGH with microarrays of six 
lung adenocarcinoma cell lines. We identified 587 genes 
showing increases in DNA copy number across all six 
cell lines (Supplementary Table 1S), which were 
distributed as 90 amplicons on all chromosomes except 
for chromosomes 13 and Y (Supplementary Table 2S). 
A subsequent transcript test with the identical arrays of 
the same cell lines revealed 275 genes that showed 
increased mRNA levels (Supplementary Table 3S). 
Using random permutation tests across all cancer cell 
lines, we identified 183 genes (31%) that showed 
elevated transcript levels from the 587 genes that were 
over-represented in the genome (Table 1), suggesting 
that elevated transcript levels of the 183 genes may 
reflect their genomic over-representation in the cancer 
cells. These findings are consistent with previous reports 
linking genomic changes with altered transcript patterns 
in breast cancer (Hyman et al., 2002; Pollack et al., 
2002). However, our finding that only 31% of the genes 
showing increased DNA copy numbers had cognate 
increases in transcript expression in lung adenocarcinomas 
is different from the overall rates of 40-60% 
reported for breast cancer (Hyman et al., 2002; Pollack 
et al., 2002). This discordance may reflect methodologic 
differences between studies or biological differences 
between breast cancer and lung adenocarcinoma. 
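
The text states that random permutation tests across the cell lines were used to link copy number to transcript level, but it does not give the test statistic. The following is a minimal sketch of one common form, an exact permutation test on Pearson's r for a single gene measured in six cell lines; the choice of statistic and the example ratios are assumptions made here for illustration and are not taken from the paper.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(x, y):
    """Exact two-sided permutation p-value for Pearson's r.

    Every ordering of y is enumerated, which is feasible for six cell
    lines (6! = 720 orderings).
    """
    observed = abs(pearson_r(x, y))
    hits = 0
    total = 0
    for shuffled in permutations(y):
        total += 1
        # Small tolerance so the identity ordering always counts as a hit.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical copy-number and transcript ratios for one gene in six cell lines.
copy_number = [1.1, 2.4, 3.0, 4.2, 4.9, 6.3]
transcript = [1.3, 2.0, 3.5, 3.9, 5.6, 6.0]

print(f"r = {pearson_r(copy_number, transcript):.3f}")
print(f"permutation P = {permutation_p_value(copy_number, transcript):.4f}")
```

With only six samples the smallest attainable two-sided P value is on the order of 1/720, which is one reason per-gene tests in small panels are usually combined with a correction for the number of genes tested.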

Proteomic analyses identify four genes for which protein 
abundance was associated with increases in the cognate 
gene and transcript levels 

Analysis of transcript patterns is insufficient for under- 
standing the expression of protein products and the 
effect of genomic over-representation on the expression 
of their cognate proteins. To extend these findings 
beyond genomic over-representation to expression of 
the protein products of those genes, we next assessed 
protein expression in the same cell lines by two-dimensional 
polyacrylamide gel electrophoresis (PAGE) 
and found that 42 different proteins, representing 42 
individual genes, were significantly increased in the 
cancer cell lines (Table 2; Supplementary Figures 1S and 
2S). Some of these proteins were identified as having 
multiple isoforms, and all individual isoforms exhibited 
increases in expression ranging from 4.6 to 12.8 times 
their expression in normal lung tissue cells. In comparing 
protein level of the 42 genes with changes in their 
cognate genomic and mRNA expression from the global 
microarray analyses, we found that four (9.5%) of those 
42 genes - PRDX1, EEF1A2, CALR, and KCIP-1 - 
showed statistically significant correlations between 
elevated protein expression and increases in both copy 
number and mRNA expression (all r>0.84; P<0.05) 
(Table 2) in the cancer cell lines. These findings imply 
that the abundance of these four proteins is attributable 
to the amplification and consequent elevated transcription 
of their cognate genes. 
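
As a rough plausibility check on the reported thresholds (r>0.84, P<0.05), a correlation coefficient can be converted to a Student's t statistic with n - 2 degrees of freedom. The sketch below assumes n = 6, i.e. one observation per cancer cell line; the text does not state the sample size used for each correlation, so this is an assumption. The critical value 2.776 is the standard tabulated two-sided 5% value for 4 degrees of freedom.

```python
from math import sqrt

# Tabulated Student's t critical value: 4 degrees of freedom, two-sided alpha = 0.05.
T_CRIT_DF4_TWO_SIDED_05 = 2.776


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r from n paired observations."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


N_CELL_LINES = 6  # assumption: one observation per cancer cell line
t_value = t_from_r(0.84, N_CELL_LINES)
print(f"t = {t_value:.2f}, critical value = {T_CRIT_DF4_TWO_SIDED_05}")
print("significant at two-sided 0.05:", t_value > T_CRIT_DF4_TWO_SIDED_05)
```

Under that assumption, a correlation above 0.84 exceeds the critical value, which is consistent with the P<0.05 reported in the text.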

Validation of copy number, transcript, and protein 
expression of PRDX1, EEF1A2, CALR, and KCIP-1 
in lung cancer cell lines 

To confirm our findings from the high-throughput 
analyses, we next used Southern, Northern, and Western 
blotting to assess DNA, RNA, and protein levels for the 
four genes identified in the six cell lines. For compar- 
ison, we arbitrarily chose one gene, NFKB1, in which an 
increase in protein level did not correlate with genetic 
changes. Overall, we found excellent concordance 
between the CGH microarray and Southern blotting 
analyses, transcript array and Northern blotting ana- 
lyses, and proteomic and Western blotting analyses for 
all five genes (Figure 1). For example, KCIP-1 showed 
fivefold amplification in five of the six cancer cell lines, 
whereas NFKB1 showed no such increase in any of the 
cell lines. As for transcript expression, Northern blotting 
of EEF1A2 showed high expression in five of the six 
cancer cell lines; again, levels of NFKB1 transcript were 
not increased in any cancer cell line as compared with 
normal bronchial epithelial cells. The results of Western 
blotting were also consistent with the results of the 
proteomic experiments; for example, five of the cancer 
cell lines exhibited strong protein bands for PRDX1 as 
compared with normal cells. These findings provide 
strong support for the validity of the results derived 
from the high-throughput techniques in this study. 

These parallel analyses also revealed close correla- 
tions in the extent of changes in gene copies, transcript, 
and protein of each of the four genes in the cancer cell 
lines. For example, in the five cancer cell lines that 
showed at least fourfold increases in EEF1A2 copy 
number, expression of transcript and protein was also 
increased by at least a factor of four (relative to 
their expression in normal cells) (Supplementary Figure 
3S). The protein abundance of the four genes showing 
Table 1 List of 183 genes with statistically significant correlation 
corresponding increases in both DNA copy number and 
mRNA provides further evidence that these could be 
oncogenes, the activation of which is reflected by 
genomic amplification and consequent increases in 
transcript level in lung adenocarcinoma cell lines. 

Specific inhibition of EEF1A2 and KCIP-1 expression by 
siRNAs led to decreased cell proliferation and induction of 
apoptosis 

To further prove the oncogenic function of the identified 
genes in lung tumorigenesis, we used siRNAs to inhibit 
the endogenous expression of EEF1A2 and KCIP-1 
protein in four lung cancer cell lines (H1563, H229, 
H522, and SK-LU). Transfection of the cancer cells with 
specific siRNAs reduced the level of EEF1A2 and 
KCIP-1 protein by 70-90% 48 h after transfection 
(Supplementary Figure 4S). In contrast, EEF1A2 and 
KCIP-1 protein levels remained unchanged in mock-treated 
control cells and in cells transfected with a 
scrambled siRNA sequence. At 48 h after siRNA 
transfection, the percentage of proliferation of the 
transfected cancer cells was reduced to 15-30% as 
compared with 91-100% of cell proliferation of the 
same cell lines treated with PBS or scrambled siRNA 
(Supplementary Figure 5S). Apoptosis of siRNA- 
transfected cells was 27-34%, whereas only 4% of the 
same cell lines treated with PBS or scrambled siRNA 
showed apoptosis. These results strongly support an 
oncogenic role for the identified genes in lung cancer and 
confirm their potential usefulness as therapeutic targets 
for the disease. 



Amplification and protein expression of KCIP-1 and 
EEF1A2 in lung tissue 

To further validate these findings and to assess the 
possible clinical significance of the four putative 
oncogenes identified from the cell lines, we first 
applied fluorescence in situ hybridization and immunohistochemical 
analysis, in parallel, to commercially 
available human lung tissue microarrays (Ambion, 
Austin, TX, USA) to evaluate the status of two of these 
four genes in lung cancer tissue specimens. (Commercially 
available antibodies to PRDX1 or CALR were 
not suitable for use in immunohistochemical analysis 
when this report was written.) Overexpression of KCIP-1 
and EEF1A2 protein in the tumors was concordant 
with amplification of the corresponding genes 
(P = 0.0003 for KCIP-1 and P = 0.0011 for EEF1A2). 
For example, 16 (35%) of the 46 lung adenocarcinomas 
in the microarray showed amplification of KCIP-1, and 
strong cytoplasmic staining for KCIP-1 protein was seen 
in 18 tumors (39%) (Figure 2). We next examined 
whether overexpression of these genes was associated 
with increased cell proliferation by analysing Ki-67 
expression in contiguous sections of the tissue microarrays. 
Positive Ki-67 expression was found to correlate 
with positive expression of both KCIP-1 (P = 0.02) and 
EEF1A2 (P = 0.01). To extend these findings, we then 
studied 11 tissue microarray blocks comprising normal 
and tumor tissue specimens from 113 patients with 
pathologic stage I non-small-cell lung cancer who had 
undergone curative surgery (Wang et al., 2005). 
Immunohistochemical analysis showed that EEF1A2 
was expressed in 32 cases (28%) and KCIP-1 in 29 cases 
(26%). Univariate and multivariate Cox proportional 
hazards models were used to detect possible associations 
between EEF1A2 and KCIP-1 expression and clinicopathologic 
variables. Expression of EEF1A2 or KCIP-1 
was associated with short overall survival time 
(P = 0.0012 for EEF1A2 and P = 0.0026 for KCIP-1) 
(Supplementary Figure 6S). Age at diagnosis, histologic 
type of cancer, degree of tumor differentiation, and 
smoking history were not associated with survival time. 
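
As a quick arithmetic check, the percentages quoted in this section follow directly from the stated counts. The short Python sketch below recomputes them; the helper name `percent` and the dictionary labels are illustrative only and do not come from the original report.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# Counts quoted in the text: the commercial array (46 adenocarcinomas)
# and the 113-patient stage I cohort.
reported = {
    "KCIP-1 amplified (46 tumors)": (16, 46),
    "KCIP-1 strong staining (46 tumors)": (18, 46),
    "EEF1A2 expressed (113 patients)": (32, 113),
    "KCIP-1 expressed (113 patients)": (29, 113),
}

for label, (count, total) in reported.items():
    print(f"{label}: {percent(count, total)}%")
```

The rounded values agree with the 35%, 39%, 28%, and 26% given in the text.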

Although only two genes were validated in the lung 
tissue microarrays (because available antibodies to 
the other two genes were not suitable for use in 
Table 2 Proteins showing significant overexpression in cancer cell lines relative to those in normal bronchial epithelial cell lines and their
correlation coefficients with increased DNA copy number or mRNA values^a

Accession  Gene ID  Symbol   Mw/pI     Protein name                                            r (genomic)  r (mRNA)
                             64.9/4.5  Cell division cycle 25B                                 0.045116     0.283214
P50290     998      CDC42    21.3/6.1  Cell division cycle 42 (GTP-binding protein, 25 kDa)    -0.47636     0.088108
P61586     387      RHOA     19.8/6.9  Ras homolog gene family, member A                       -0.49782     -0.00544
P63000     5879     RAC1     21.5/6.8  Ras-related C3 botulinum toxin substrate 1              -0.05583     -0.03566
P07437     203068   TUBB     49.6/6.5  Tubulin, beta polypeptide                               0.255533     0.145010
P24864     898      CCNE1    47.1/4.3  Cyclin E1                                               -0.65116     0.232149
P04141     1437     CSF2     16.9/6.3  Colony stimulating factor 2 (granulocyte-macrophage)    -0.64636     -1.28108
P28072     5694     PSMB6    25.3/5.2  Proteasome (prosome, macropain) subunit, beta type, 6   -0.69782     -1.30544
P00352     216      ALDH1A1  54.7/4.3  Aldehyde dehydrogenase 1 family, member A1              -0.75872     0.03356
Q03013     2948     GSTM4    25.3/5.0  Glutathione S-transferase M4                            -0.78533     0.134501
P63241     1984     EIF5A    10/4.4    Eukaryotic translation initiation factor 5A             -0.97893     -1.44321
Q01469     2171     EFABP    18.0/4.2  Fatty acid-binding protein 5                            0.25684      -0.36432

a Only genes showing statistically significant increases in protein expression together with simultaneous increases in both genomic copy
number and transcript level are considered potential putative oncogenes in lung adenocarcinoma cells. b r, Spearman correlation coefficients between
proteins and genomic or mRNA values, based on all six cancer cell lines; bold indicates P < 0.05, if r > 0.84000. Mw, molecular weight; pI,
isoelectric point.



immunohistochemical analysis), these findings are con- 
sistent with those from our cell lines, demonstrating 
again that genomic amplification and consequent 
increases in amounts of transcript may be, at least in 
part, driving the abundance of proteins in these lung 
tumors. The association between expression of these 
genes and that of Ki-67, a known indicator of poor 
prognosis in lung cancer (Martin et al., 2004), suggests
that activation of these genes may be an indicator of 
tumor aggressiveness. These results also suggest that 
expression of EEF1A2 and KCIP-1 proteins in stage I
non-small-cell lung cancer may be useful as a marker for
distinguishing patients with relatively poor prognosis 
from those who might benefit from adjuvant treatment. 



Discussion 

Our current study illustrates the power of integrated 
functional genomic analyses for identifying putative 
oncogenes and for evaluating their potential clinical 
significance. Among the four identified oncogenes, three 
genes (PRDX1, CALR, and KCIP-1) have been implicated
in lung tumorigenesis. PRDX1 is an antioxidant
protein involved in regulating cell proliferation, differentiation,
and apoptosis. Kim et al. (2003) found
PRDX1 expression to be elevated in both lung cancer
and adjacent normal lung tissue, suggesting that
activation of PRDX1 may enhance proliferation in lung
cancer. CALR has a major role in Ca2+ binding and the

Figure 1 Confirmation by Southern, Northern, and Western blot
analyses of increased DNA copies, transcript levels, and protein
levels in the four genes identified in high-throughput analyses. For
comparison, we arbitrarily chose one gene, NFKB1, in which an
increased protein level did not correlate with genetic changes. The
blotting results are consistent with the results from the CGH array,
transcript array, and proteomic analyses. Nor indicates normal
bronchial epithelial cell line. All the experiments were repeated at
least three times with each cell line. Means of signal intensities on
Southern, Northern, and Western blots, normalized to β-actin, along
with 95% confidence intervals, were calculated (β-actin signals are
not shown in the figure; two different normal bronchial epithelial
cell lines were used in the confirmation and only one normal cell
line is shown in the figure).



transcriptional regulation of other genes and was 
recently found to be overexpressed in 73% of 40 lung 
adenocarcinomas (Oates and Edwards, 2000). KCIP-1 
belongs to the 14-3-3 family, which participates via the 
MAPK and Wnt signaling pathways in the regulation of 
many cellular processes including cell proliferation and 
differentiation as well as tumorigenesis (Thomas et al., 
2005). KCIP-1 was recently found to be expressed in all
12 lung tumors tested in a single-institution study (Qi
et al., 2005). Interestingly, EEF1A2 was originally
considered a putative oncogene in ovarian cancer on
the basis of its being amplified in 25% and overexpressed
in 30% of the same set of ovarian tumors
(Anand et al., 2002); functional analyses have established
its oncogenic role in cellular transformation (Lee,
2003). Our discovery that EEF1A2 may be a putative
oncogene in lung adenocarcinoma demonstrates the 
power of our functional genomic strategy for rapidly 
identifying potential oncogenes. 

Although the main focus of this study was to 
specifically identify putative oncogenes, it should be 



noted that 90.7% of the genes showing high protein 
expression did not show corresponding increases 
in both DNA copy number and transcript, a finding 
consistent with that of others that transcriptional, 
translational, and post-translational regulatory mechanisms
can greatly influence the abundance of protein
in lung tumorigenesis (Chen et al., 2002). For example,
NFKB1 is a critical arbiter of immune responses,
cell survival, and transformation and is often activated
in several types of tumors (Chen et al., 2002). Deregulation
of NFKB1 is thought to be modulated
through phosphorylation of Ser337 by protein kinase
A (Chen et al., 2002). In our study, 68.8% of the
genes showing over-representation in the genome
did not show elevated transcript levels, implying
that at least some of these genes are 'passenger' genes
that are concurrently amplified because of their
location with respect to amplicons but lack biological
relevance in terms of the development of lung
adenocarcinoma. 
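
The selection logic described in this study can be sketched in a few lines of Python: a gene is treated as a putative oncogene only when protein overexpression coincides with increases in both DNA copy number and transcript level, whereas a gene that is over-represented in the genome without an elevated transcript level is a candidate 'passenger'. This is a minimal illustration under those stated assumptions, not the authors' analysis pipeline; the class name, function name, category labels, and the example genes are hypothetical placeholders rather than data from the study.

```python
from dataclasses import dataclass


@dataclass
class GeneCall:
    name: str
    dna_gain: bool    # over-represented in the genome (CGH array)
    mrna_up: bool     # elevated transcript level
    protein_up: bool  # overexpressed protein


def classify(gene):
    """Assign a gene to one of three illustrative categories."""
    if gene.dna_gain and gene.mrna_up and gene.protein_up:
        return "putative oncogene"
    if gene.dna_gain and not gene.mrna_up:
        return "possible passenger"
    return "other"


# Hypothetical calls, for illustration only.
calls = [
    GeneCall("geneA", dna_gain=True, mrna_up=True, protein_up=True),
    GeneCall("geneB", dna_gain=True, mrna_up=False, protein_up=False),
    GeneCall("geneC", dna_gain=False, mrna_up=False, protein_up=True),
]

for call in calls:
    print(call.name, "->", classify(call))
```

The third case corresponds to proteins such as NFKB1 in the text, whose abundance rises without matching genetic changes.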

Although the potential oncogenes we identified here 
are likely to be important, certainly other oncogenes 
could be involved in the development of lung adeno- 
carcinoma. The oligo microarray we used consists of 
22 000 probes, which represent only about 60% of the 
human genome. Moreover, each probe was designed for 
the 3' region of expressed sequence tags of the selected 
genes. Also, our results were initially derived from 
cancer cell lines, although the findings were later 
confirmed in human tissue samples. Our ongoing study 
using microarrays with information on more genes 
and the development of high-resolution proteomic 
analyses for use with larger numbers of specimens will 
allow more comprehensive analyses of the molecular 
consequences of gene amplifications. Such expanded
analyses will very likely lead to the identification of
additional oncogenes. 

Some of the results of our current study were 
comparable to those of other studies of lung cancer. 
For example, genomic copy number and protein levels 
of KCIP-1 were previously found to be amplified and 
overexpressed in primary lung cancers by cDNA clone-based
CGH array analysis (Jiang et al., 2004) and
proteomic analysis (Chen et al., 2002), respectively. Our
functional genomic approach, which integrates simultaneous
CGH, transcript microarrays, proteomic analyses,
and siRNA, allows us not only to quickly identify 
potential oncogenes but also to explore their significance 
as diagnostic and therapeutic targets in tumor progres- 
sion — more than could be achieved by any technique 
alone. 

Genes identified in this way may serve as promising 
targets for diagnosis and therapy in lung adenocarci- 
noma. Further research on the clinical implications of 
such genes is needed; experiments now underway in our 
laboratory include overexpression of the genes in 
normal cells, disruption of the function of these genes 
in cancer cells, and investigation of how interactions 
among these genes (or interactions with other known 
oncogenes) may mediate the expression of the trans- 
formed phenotype. 

Figure 2 EEF1A2 amplification is associated with high EEF1A2 protein expression in lung adenocarcinomas. (a) Cells from a lung
adenocarcinoma sample in which EEF1A2 is amplified show more green signals (EEF1A2) than red signals (chromosome 20
centromeric probe) (original magnification, ×400). (b) Immunohistochemical staining of cells from the same tissue sample as in panel
a shows strong EEF1A2 staining in the cytoplasm. (c) A lung adenocarcinoma sample with two copies of EEF1A2 and chromosome 20
centromeric probe, indicating no EEF1A2 amplification (original magnification, ×400). (d) Immunohistochemical staining of cells
from the same tissue sample as in panel c shows negative staining for EEF1A2.



Materials and methods 

Cell lines 

Six human lung adenocarcinoma cell lines (H23, H229, H1792,
SK-LU-1, H522, and H1563) were obtained from the
American Type Culture Collection (Manassas, VA, USA).
Two normal bronchial epithelial cell lines were obtained from
Clontech (Palo Alto, CA, USA). Genomic DNA, mRNA, and
protein were derived from a single harvest of these cells. 

DNA and RNA profiles by microarray analysis 
Genomic DNA labeling and hybridization were performed as 
described previously (Barrett et al., 2004) with Agilent's
Human 1A Oligo Microarray (V2) (Agilent Technologies,
Palo Alto, CA, USA), which contains 22000 unique 60-mer
oligos. Details of the protocol for analysing transcripts are
available at http://www.chem.agilent.com. Map positions for
arrayed genes were assigned by identifying the DNA sequence
represented in the UniGene cluster and matching it with the
Golden Path genome assembly (http://genome.ucsc.edu/; May
7, 2004 Freeze). Microarray images of DNA copy number and
expression were analysed by using Agilent CGH Analytics and
Feature Extraction software. DNA copy number profiles that
deviated significantly from background signal ratios (measured
from normal control cell hybridization, as described elsewhere;
Barrett et al., 2004) were interpreted as evidence of true
differences in DNA copy number. The criteria for defining
genomic over-representation and amplicons are described
elsewhere (Hyman et al., 2002); details are given in the



Supplementary Information. An increase in mRNA level was 
defined as a twofold increase in signal ratio relative to that of 
the control (log2 ratio > 1).
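
The transcript criterion stated above can be illustrated with a small Python sketch. It assumes that the comparison is made on the base-2 logarithm of the sample-to-control signal ratio and that the inequality is strict, as the parenthetical suggests; whether an exactly twofold ratio was counted as an increase is not specified in the text. The function names and the signal values are hypothetical.

```python
import math


def log2_ratio(sample_signal, control_signal):
    """Base-2 logarithm of the sample-to-control signal ratio."""
    return math.log2(sample_signal / control_signal)


def mrna_increased(sample_signal, control_signal, threshold=1.0):
    """True when the log2 ratio exceeds the threshold (1.0 = twofold)."""
    return log2_ratio(sample_signal, control_signal) > threshold


# Hypothetical signal intensities, for illustration only.
print(mrna_increased(500.0, 100.0))  # fivefold ratio
print(mrna_increased(150.0, 100.0))  # 1.5-fold ratio
```

Working on the log2 scale makes the threshold symmetric: a log2 ratio of 1 corresponds to a doubling, and a log2 ratio of -1 to a halving.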

Quantitative two-dimensional PAGE and mass spectrometry 
Analysis of proteins by two-dimensional PAGE and their 
identification by mass spectrometry were performed as 
previously described (Shen et al., 2004). Briefly, protein pellets
were solubilized in rehydration buffer, after which the
first-dimension isoelectric focusing was carried out with a Protean
IEF Cell (Bio-Rad Laboratories) and the second-dimension
separation was carried out with Bio-Rad's Ready Gel Precast
Gels and the Bio-Rad Criterion Cell apparatus. Protein spots
were visualized by silver-based staining, and all gels were
assessed with Bio-Rad's PDQuest 2D gel image analysis
software. Selected spots were subjected to in-gel tryptic
digestion and analysed on a Voyager-DE PRO matrix-assisted
laser desorption ionization/time-of-flight mass spectrometer
(Applied Biosystems, Foster City, CA, USA). The mass list of
the 20 most intense monoisotopic peaks for each sample was
entered in the MS-Fit search program (v3.2.1)
(http://prospector.ucsf.edu/ucsfhtml4.0/msfit.htm) and searched in
the National Center for Biotechnology Information protein 
database. 
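
The peak-selection step mentioned above (keeping the 20 most intense peaks before a database search) can be sketched as follows. This is only an illustration of ranking peaks by intensity; it does not perform monoisotopic peak picking and does not reproduce MS-Fit, and the function name, the `(m/z, intensity)` tuple layout, and the example values are assumptions.

```python
def most_intense_peaks(peaks, n=20):
    """Return the m/z values of the n most intense peaks, in ascending m/z.

    `peaks` is an iterable of (mz, intensity) tuples.
    """
    top = sorted(peaks, key=lambda peak: peak[1], reverse=True)[:n]
    return sorted(mz for mz, _ in top)


# Hypothetical peak list, for illustration only.
example = [(1000.5, 10.0), (1200.7, 50.0), (900.4, 30.0), (1500.8, 5.0)]
print(most_intense_peaks(example, n=2))
```

Returning the masses in ascending order is a convenience for pasting a mass list into a search form; the ranking itself depends only on intensity.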

Southern, Northern, and Western blot analyses 
Southern, Northern, and Western blot hybridizations were 
performed according to standard protocols. cDNA clones for 
the tested genes were purchased from Invitrogen (Carlsbad, 
CA, USA) and prepared as probes for the blot hybridizations.
Antibodies used were obtained as follows: PRDX1, CALR,
NFKB1, KCIP-1, and β-actin from Santa Cruz Biotechnology
(Santa Cruz, CA, USA); and EEF1A2 from Upstate Biotechnology
(Waltham, MA, USA).

Fluorescence in situ hybridization and immunohistochemical
analyses of lung tissue microarrays

Fluorescence in situ hybridizations and immunohistochemical
analyses of KCIP-1 and EEF1A2 were carried out as described
elsewhere (Jiang et al., 2002; Wang et al., 2005) with Lung
Tissue Microarrays (Ambion, Austin, TX, USA) and 11
homemade microarray blocks containing tissue samples from
113 patients with pathologic stage I non-small-cell lung cancer
(Wang et al., 2005). DNA probes specific for KCIP-1 and
EEF1A2 were obtained by screening a Human BAC Clone
library (Invitrogen) by polymerase chain reaction as described
previously (Jiang et al., 2002). The antibodies used for the
immunohistochemical analyses were the same as those used
for the Western blotting. Cell proliferation of the lung tissues
was assessed with a Ki-67 monoclonal antibody from Santa
Cruz Biotechnology. Definitions of the cutoff value for a 
positive result of each antibody are shown in Supplementary 
Information. 

siRNA transfection, cellular proliferation assay, and apoptosis 
analysis 

Transfections were carried out by using siPORT Lipid
Transfection Agent (Ambion) with siRNAs targeting KCIP-1
or EEF1A2 or with a scrambled siRNA duplex (siControl)
(Dharmacon Inc., Lafayette, CO, USA), with PBS used as a
negative control (Jiang et al., 2002). Cells were fixed 24, 48, or
96 h later and subjected to further tests. All siRNAs were
prepared by using a transcription-based method with Silencer
siRNA according to the manufacturer's instructions (Ambion).
Sequences of the individual siRNAs are listed in
Supplementary Table 4S. Inhibition of cell growth by the
were examined as described elsewhere (Hyman et al., 2002,
Supplementary Information). Correlations between protein
the corresponding genes were evaluated with the Spearman
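
The Spearman rank correlation named here, and reported in the Table 2 footnote for the six cancer cell lines, can be sketched in plain Python as a Pearson correlation computed on ranks, with tied values given their average rank. This is a generic textbook formulation offered as an illustration; the software actually used in the study is not stated in this text, and the function names and the six example values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical protein and mRNA values for six cell lines.
protein = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mrna = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
print(round(spearman(protein, mrna), 4))
```

With only six observations the coefficient can take few distinct values, which is why a high cutoff is needed before a correlation is treated as significant.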